EMIS/ELibM Electronic Journals

Outdated Archival Version

These pages are not updated anymore. They reflect the state of 20 August 2005. For the current production of this journal, please refer to http://www.math.psu.edu/era/.

%_ **************************************************************************
%_ * The TeX source for AMS journal articles is the publishers TeX code     *
%_ * which may contain special commands defined for the AMS production      *
%_ * environment.  Therefore, it may not be possible to process these files *
%_ * through TeX without errors.  To display a typeset version of a journal *
%_ * article easily, we suggest that you either view the HTML version or    *
%_ * retrieve the article in DVI, PostScript, or PDF format.                *
%_ **************************************************************************
% Author Package file for use with AMS-LaTeX 1.2





\newcommand\NN{\mathbb N}
\newcommand\QQ{\mathbb Q}
\newcommand\RR{\mathbb R}
\newcommand\ZZ{\mathbb Z}
\newcommand\cA{\mathcal A}
\newcommand\cC{\mathcal C}
\newcommand\cE{\mathcal E}
\newcommand\cF{\mathcal F}
\newcommand\cG{\mathcal G}
\newcommand\cN{\mathcal N}

       [\hskip\labelsep \theorem@headerfont (##2)\ ##1]}%
  \item[\hskip\labelsep \theorem@headerfont ##2\ ##1\ (##3)]}}


{Powers of positive polynomials and codings of Markov chains onto
Bernoulli shifts}

% author one information
\author{Brian Marcus}
\address{IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120}

% author two information
\author{Selim Tuncel}
\address{Department of Mathematics, Box 354350, University of Washington,
Seattle, WA 98195}
\thanks{Partially supported by NSF Grant DMS--9622866}

\dateposted{June 30, 1999}
\PII{S 1079-6762(99)00066-9}
\copyrightinfo{1999}{American Mathematical Society}

\subjclass{Primary 28D20; Secondary 11C08, 05A10}

\date{January 21, 1999}

\commby{Klaus Schmidt}

We give necessary and sufficient conditions for a
Markov chain to factor onto a Bernoulli shift (i) as an eventual
right-closing factor, (ii) by a right-closing factor map, (iii) by a
one-to-one a.e. right-closing factor map, and (iv) by a regular
isomorphism. We pass to the setting of polynomials in several
variables to represent the Bernoulli shift by a nonnegative polynomial
$p$ in several variables and the Markov chain by a matrix $A$ of such
polynomials. The necessary and sufficient conditions for each of
(i)--(iv) involve only an eigenvector $r$ of $A$ and basic invariants
obtained from weights of periodic orbits. The characterizations of 
(ii)--(iv) are deduced from (i). We formulate (i) as a combinatorial
problem, reducing it to certain state-splittings (partitions) of paths
of length $n$. In terms of positive polynomial masses associated with
paths, the aim then becomes the construction of partitions so that the
masses of the paths in each partition element sum to a multiple of
$p^n$, the multiple being prescribed by $r$. The
construction, which we sketch, relies on a description of the terms of 
$p^n$ and on estimates of the relative sizes of the coefficients of $p^n$.


We will describe in this article several recent results for coding Markov
chains onto Bernoulli shifts. Adopting an approach first espoused by Parry
and Tuncel, we will associate polynomial matrices to Markov chains and
translate the jointly measure-theoretic and topological coding problems into
combinatorial ones. Bernoulli shifts will then be represented by polynomials
$p$ with nonnegative integral coefficients, and we will find ourselves
facing questions about powers of $p$: We start our exposition with a number
of results on the structure and coefficients of large powers of positive
polynomials in several variables. This material is isolated in section 2,
and readers whose interests lie with powers of polynomials may go directly
to section 2. We remark that proofs of the results in
section 2 rely, in turn, on ideas from ergodic theory, specifically, the
notion of entropy at a weight-per-symbol\  and the related equilibrium states \cite{MT2}.

It is well known from Ornstein's isomorphism theory \cite{O} that entropy is a
complete invariant of measure-theoretic isomorphism for mixing Markov
chains. In that result the isomorphism is \emph{two-sided}, meaning that it
and its inverse are allowed to use the entire past and the entire future of
a point in order to determine the present coordinate of its image. In a
one-sided isomorphism, the isomorphism and its inverse are allowed to use
the entire past, but none of the future. In \cite{AMT} Markov chains were
completely classified up to one-sided isomorphism, via an effective
algorithmic procedure. An intermediate notion is regular isomorphism. Here,
the isomorphism and its inverse are allowed to use the entire past and a
uniformly bounded amount of the future.

Ideas emanating from symbolic dynamics, in particular right-resolving and
right-closing maps of shifts of finite type (see below for definitions),
have a strong bearing on regular isomorphism and one-sided isomorphism of
Markov chains: If a right-closing map is one-to-one\  almost everywhere, then
it defines a (rather concrete) regular isomorphism between the shifts 
with respect to their measures of maximal entropy. More
generally, if the shifts of finite type are endowed with Markov measures and
the map preserves measure, then we obtain a regular isomorphism between the
Markov chains. It was shown in [BT] that two Markov chains are regularly
isomorphic if and only if there exists a Markov chain which factors onto
each of them by a right-closing map which is one-to-one\  almost everywhere.
Thus, the study of regular isomorphism largely reduces to that of
right-closing maps between Markov chains.

Likewise, two Markov chains are one-sidedly isomorphic if and only if there
exists a Markov chain which factors onto each of them by a right-resolving
map which is one-to-one\  almost everywhere. This fact serves as the first step of
the classification of \cite{AMT}. Right-closing maps form the closure of
right-resolving ones under block isomorphism, that is, measure-preserving
topological conjugacy.

We will give complete answers to the following questions for an arbitrary
Markov chain and a Bernoulli shift.
\item[(1)]Does the Markov chain eventually factor onto the Bernoulli shift by
right-closing maps?
\item[(2)]Does the Markov chain factor onto the Bernoulli shift by a
map? By a right-closing map which is one-to-one\  almost everywhere?
\item[(3)]Is the Markov chain regularly isomorphic to the Bernoulli shift?

Without consideration of measures or, equivalently, restricting attention to
measures of maximal entropy, analogues of (1)--(3) were answered for shifts
of finite type in the sequence of papers \cite{M,BMT,A}. It is natural to seek
generalizations of these results to Markov chains. The work presented here
realizes the generalizations for a Bernoulli image.

Definitions and the relationship between Markov chains and
matrices of polynomials are recalled in section 3 below. The answers to
(1)--(3) are
then presented in section 4.
Sections 5 and 6 outline their proofs.
Section 5 reduces (1)--(3) to (1) and, for
(1), formulates the problem in terms of state-splittings with certain
combinatorial properties. Section 6 is concerned with the construction of
the desired state-splitting.
Details and complete proofs may be found in \cite{MT4} and \cite{MT5}.

\section{Powers of positive polynomials}
Let $p$ be a Laurent polynomial in the variables $x_1,\dots, x_k$. For
$w=\!(w_1, \dots, w_k)\!\in\ZZ^k$, write $x^w=x^{w_1}_1\cdots x^{w_k}_k$ and
denote the coefficient of $x^w$ in $p$ by $p_w$. Then
and $p_w$ are nonzero for only finitely many $w\in\ZZ^k$. Let
$\Log(p)=\{w\in\ZZ^k:p_w\ne 0\}$. The \emph{Newton polyhedron} $W(p)$ of $p$
is given by the rational convex hull of $\Log(p)$. Denote the collection of
nontrivial faces of $W(p)$ by $\cF(p)$. For $F\in\cF(p)$, let
p_F=\sum_{w\in F}p_w\,x^w,
and think of $p_F$ as the $F$-\emph{face} of $p$.
Let $\Delta(p)$ be the subgroup of $\ZZ^k$ generated by $\{a-b:a,
b\in\Log(p)\}$, and pick any $c_p\in\Log(p)$ to distinguish a coset
$c_p+\Delta(p)$ of $\Delta(p)$.

It is easy to see that $\Log(p^n)\subset nW(p)\cap (nc_p+\Delta(p))$. If
$H\ge 0$ and $S\subset\RR^k$, we use the usual norm and distance in $\RR^k$
to write $B(S, H)=\{w\in\RR^k: \dist(w,S)0$. There exist
$\epsilon_F>0$ and $N\in\NN$ such that for $a\in\ZZ^k$ the inequality
(p^n)_{a+b}\ge K(p^n)_a
holds whenever $n\ge N$, $\frac{a}{n}\in B(F, \epsilon_F)\backslash
B(\partial F, \bar\epsilon_F)$ and $b\in(\Log(p^L)\backslash

Let $p\in\ZZ^+[x^\pm_1, \dots, x^\pm_k]$ and $F\in\cF(p)$. Let
$\bar\epsilon_F$, $\theta>0$ and $D\in\NN$. There exist $\epsilon_F>0$
and $N\in\NN$ such that
whenever $n\ge N$ and $a, b, u\in\ZZ^k$ are such that $a$,
$a+u\in\Log(p^n)$, $b\in\Delta(p_F)$,
$\frac{a}{n}\in B(F, \epsilon_F)\backslash B(\partial F,
\bar\epsilon_F)$, and $\|b\|$, $\|u\|\le D$.

Let $p\in\ZZ^+[x^\pm,\dots, x^\pm_k]$ and $F\in\cF(p)$. Let $\mye_F>0$,
$l\in\ZZ$ and $D\in\NN$. There exist $\mye_F>0$ and $K, N\in\NN$ such that
\frac{1}{K}(p^{n+l})_{a+lc_F+b}\le (p^n)_a \le K(p^{n+l})_{a+lc_F+b}
whenever $n\ge N$ and $a, b\in\ZZ^k$ are such that $\frac{a}{n}\in
B(F,\mye_F)\bs B(\partial F, \bar\mye_F)$, $b\in\Delta(p_F)$ and $\|b\|\le

Details of (\ref{thm2.1})--(\ref{prop2.4}) may be found in \cite{MT4}. In
the case of (\ref{prop2.2})--(\ref{prop2.4}), the method of proof
and the more technical results at its basis may be objects of greater
general interest. They involve the notion of entropy at a weight-per-symbol\  and
the corresponding equilibrium states, which are imported from \cite{MT2}
and strengthened in the
Bernoulli setting: Letting $W_\RR(p)$ denote the real convex hull of $\Log(p)$
and $d=\mathop{\sum}_{w\in\ZZ^k}p_w\,$, it turns out that the map
$w\mapsto s(w)$ that sends $w\in W_\RR(p)$ to the corresponding equilibrium
distribution $s(w)$, which is a probability vector of size $d$, is $C^1$ on
$W_\RR(p)$. It is also shown that, for small $\epsilon_F, \delta>0$ and
large $n$, multinomial coefficients lying in a
$\delta$-ball around $w=\frac{a}{n}$ will estimate $(p^n)_a$ uniformly
closely for $w\in B(F,\epsilon_F)\bs
B(\partial F, \bar\epsilon_F)$. From this common ground, the proof of each
of (\ref{prop2.2})--(\ref{prop2.4}) is reduced to a calculation involving
multinomial coefficients.

As far as we are aware, the $F$-faces $p_F$
were introduced by Handelman \cite{H}.

\section{Markov chains and polynomial matrices}
It is well known that a directed graph $G$ defines a shift of finite type
$(X_G, \sigma_G)$. A shift-commuting continuous surjection $\phi:X_G\to
X_H$ of shifts of finite type is known as a \emph{factor map}. The map
$\phi$ is \emph{right-closing} if left-asymptotic distinct points of $X_G$
have distinct images under $\phi$. Right-closing maps are bounded-to-one
and, assuming $G$ is irreducible, there exists $d=d(\phi)\in\NN$, called the
\emph{degree} of $\phi$, so that every doubly transitive $y\in X_H$ has
exactly $d$ pre-images. If $d=1$, the map $\phi$ is said to be one-to-one\ 
almost everywhere. For
these and other basic definitions and facts on shifts of finite type and
factor maps, see \cite{LM}.

The directed graph $G$ is \emph{weighted} if, for some multiplicative
Abelian group $\cG$, each edge $e$ of $G$ is assigned a \emph{weight}
$\wt_G(e)\in\cG$. The \emph{weight} of a path $\gamma=e_1e_2\cdots e_n$ is
the product $\wt_G(\gamma)=\wt_G(e_1)\cdots\wt_G(e_n)$. Let $\Gamma(G)$ be
the subgroup of $\cG$ generated by weights of cycles, and let $\Delta(G)$ be
the subgroup of $\Gamma(G)$ consisting of ratios $\wt_G(\gamma)/\wt_G(\eta)$
of weights of cycles $\gamma, \eta$ of the same length. For irreducible $G$
of period $d$, let $\gamma,\eta$ be cycles whose lengths satisfy
$l(\gamma)=l(\eta)+d$ and use $c_G=\wt_G(\gamma)/\wt_G(\eta)$ to distinguish
a coset $c_G\Delta(G)$. (For properties of $\Gamma$, $\Delta$, $c\Delta$
see \cite{PS,MT1}.)

The graphs we consider will always be directed, irreducible and, except for
a brief spell, weighted by monomials. Such graphs correspond to
irreducible matrices $A$ over $R^+_k$: Express each entry $A(I, J)$ as a
sum (with multiplicity) of monomials, place as many edges from $I$ to $J$ as
there are terms in this sum, and bijectively assign the monomials as
weights. Confusing $A$ with the associated graph $G(A)$, we write
$\wt_A(\gamma)$, $\Gamma(A)$, $\Delta(A)$, $c_A\Delta(A)$ for the objects
defined above for $G(A)$. For a $G(A)$-cycle $\gamma$ of length
$l=l(\gamma)$, we define the \emph{weight-per-symbol} of $\gamma$ to be
$\wps_A(\gamma)=\frac{1}{l}\Log(\wt_A(\gamma))\in\QQ^k$. The rational convex
hull of $\{\wps_A(\gamma):\gamma$ cycle of $G(A)\}$ is the \emph{weight-per-symbol\ 
polytope} $W(A)$ of $A$. For $x_1,\dots,x_k>0$, let $\beta_A(x_1, \dots,
x_k)>0$ be the spectral radius of the nonnegative real-valued matrix $A(x_1,
\dots, x_k)$. Writing $\RR^{++}$ for the positive reals, this
defines the \emph{beta function}
$\beta_A:(\RR^{++})^k\to\RR^{++}$ of $A$.

We write $(X_A, \sigma_A)$ for the shift of finite type defined by $G(A)$
together with the weights-per-symbol assigned to its periodic orbits. We
finite-to-one factor maps $\phi:X_A\to X_B$ between such systems to be
\emph{weight-preserving} in that $\wps_B(\phi(\gamma))=\wps_A(\gamma)$ for
every periodic orbit $\gamma$ of $(X_A, \sigma_A)$. The (clearly necessary)
\emph{factor periodic point condition} for $A, B$ is the requirement that,
for each periodic orbit $\gamma$ of $(X_A, \sigma_A)$,
there exist in $(X_B, \sigma_B)$ a periodic orbit of the same weight-per-symbol\  whose
period divides that of $\gamma$.

Applying the above to $p\in R^+_k$ by viewing it as a $1\times 1$ matrix,
note that there is a slight clash with the definitions of $\Gamma(p)$,
$\Delta(p)$, $c_p\Delta(p)$ given in section 2, resulting from confusing
monomials and their exponents.

Markov chains are traditionally defined by stochastic matrices. Given an
irreducible stochastic matrix $M$, we construct a graph $G(M)$ taking
weights in $\RR^{++}$: The vertices of $G(M)$ are given by the indexing set
of $M$, and there is an edge of weight $M(I, J)$ from $I$ to $J$ whenever
$M(I, J)>0$. The shift of finite type $(X_M, \sigma_M)$ defined by $G(M)$
and the Markov measure $\mu_M$ defined by $M$ give us the \emph{Markov
chain} $(X_M, \sigma_M, \mu_M)$. We will write $\wt_M(\gamma)$,
$\Gamma(M)$, $\Delta(M)$, $c_M\Delta(M)$ for the objects defined for the
weighted graph $G(M)$. For $t\in\RR$, let $\beta_M(t)$ be the spectral
radius of the matrix $M^{(t)}$ with
M^{(t)}(I, J) = \left\{\begin{array}{ll}
0 &\hbox{if }M(I, J)=0,\\
M(I, J)^t &\hbox{if }M(I, J)>0.
This defines the \emph{beta function} $\beta_M:\RR\to\RR^{++}$ of $M$.

Letting $Q$ also be an irreducible stochastic matrix, we require factor maps
between $(X_M, \sigma_M, \mu_M)$ and $(X_Q, \sigma_Q, \mu_Q)$ to be
measure-preserving. In particular, $(X_Q, \sigma_Q, \mu_Q)$ is a
\emph{right-closing factor} of $(X_M, \sigma_M, \mu_M)$ if there exists a
right-closing map $\phi$ of the shift of finite type $(X_M, \sigma_M)$ onto
$(X_Q,\sigma_Q)$ such that $\mu_M\circ\phi^{-1}=\mu_Q$. The Markov chain
$(X_Q, \sigma_Q, \mu_Q)$ is a \emph{right-closing eventual factor} of $(X_M,
\sigma_M, \mu_M)$ if $(X_Q, \sigma^n_Q, \mu_Q)$ is a right-closing factor of
$(X_M, \sigma^n_M, \mu_M)$ for all large $n$.

Suppose the rows of $Q$ are identical, so that $(X_Q, \sigma_Q, \mu_Q)$ is a
\emph{Bernoulli shift}. Also suppose $\beta_M=\beta_Q$, since this equality
is necessary for the coding problems under consideration. Then, by
(6.1) of \cite{MT1}, $\Gamma(M)\subset\Gamma(Q)$. Pick a basis $a_1, \dots,
a_k$ of the finitely generated Abelian subgroup $\Gamma(Q)$ of $\RR^{++}$.
Upon conjugating $M, Q$ by suitable diagonal matrices, we obtain matrices
whose entries lie in $\Gamma(Q)$. Hence, each nonzero entry of $M, Q$ can be
uniquely expressed as a product of integral powers of $a_1, \dots, a_k$.
Replacing each $a_i$ by an indeterminate $x_i$, we get from $M, Q$
irreducible matrices $A, B$ over $R^+_k$. A finite-to-one factor map $\phi$
of the shift of finite type $(X_M,\sigma_M)$ onto $(X_Q, \sigma_Q)$
satisfies $\mu_M\circ\phi^{-1}=\mu_Q$ if and only if $\phi$
is a (weight-preserving) factor map of $(X_A, \sigma_A)$ onto
$(X_B, \sigma_B)$. (See section 2 of \cite{T} for details and credits.)
Moreover, since the rows of $Q$ are identical, we can put
$p=\mathop{\sum}_{J}B(I, J)$ for any row $I$ to identify $(X_B,
\sigma_B)$ with $(X_p, \sigma_p)$.

Thus, we are in the position of having $p\in R^+_k$
and an irreducible $R^+_k$-matrix $A$ with $\beta_A=p$ and
studying weight-preserving maps of
$(X_A, \sigma^n_A)$ onto $(X_p, \sigma^n_p)$. The equality $\beta_A=p$
implies that $W(A)=W(p)$, the Newton polyhedron of $p$. Consider
$F\in\cF(p)$. Define a subgraph of $G(A)$ by retaining an edge $e$ of $G(A)$
(and its weight $\wt_A(e)$) if and only if there exists a $G(A)$-cycle
$\gamma$ which contains $e$ and has $\wps_A(\gamma)\in F$. The resulting
subgraph is nonwandering; that is, it consists only of irreducible
components, which we call the \emph{$F$-components} of $A$. The beta
function of an $F$-component either equals $p_F$ or it is strictly less than
$p_F$ on all of $(\RR^{++})^k$; the $F$-components in the former category
are said to be \emph{principal}. (See \cite{MT1,T} for the details.)
$F$-components of $A$ will be central to our solutions of (1)--(3) of the
introduction. Note that when
$F=W(p)$, we have a single (principal) $W(p)$-component, namely, $A$ itself.

\section{The coding results}
To formulate the conditions that will feature in the solutions of
(1)--(3), let $p\in R^+_k$ and let $F$ be a proper face of
$W(p)$. A polynomial $q\in R_k$ will be said to \emph{$F$-conform} to $p$ if
there exists $n\in\NN$ so that for some $u\in\ZZ^k$ we have
\Log(q)+u\subset\Log(p^n)\hbox{ and }(\Log(q)+u)\cap\Log(p^n_F)\ne\emptyset;
in this case we will put $\Log_F(q)\!=\!\Log(q)\cap (\Log(p^n_F)-u)$ and
$q_F \!=\!
\mathop{\sum}_{w\in \Log_F(q)}\,q_w x^w$. Roughly speaking, $F$-conformity
asks that, at and near $F$, both the ``shape'' and the ``gap structure'' of
$\Log(q)$ be subordinate to $\Log(p^n)$.
For a group $\Delta$ of monomials in $R_k$, we put $R(\Delta)=\{q\in

Let $A$ be an irreducible $R^+_k$-matrix with $\beta_A=p$. By a
\emph{right eigenvector} of $A$ we will mean a nontrivial vector $r$ over
$R_k$ such that $Ar=pr$. Our results will require a right eigenvector which
satisfies three conditions. The first two are:
\item[\textbf{(a)}]The entries of $r$ lie in $R^+_k$.
\item[\textbf{(b)}]For each proper face $F$ of $W(p)$ and each state $I$ that
to an $F$-component of $A$, the entry $r(I)$ $F$-conforms to $p$.

The existence of a right eigenvector satisfying (a) was proved in
\cite{MT3}. For the description of the third condition, let $F\in\cF(p)$,
let $\cC$ be a principal $F$-component of $A$, and let $A_\cC$ be the matrix
representing $\cC$. Using (b), let $r_\cC$ be the vector indexed by the
states of $\cC$ and having $r_\cC(I)=r(I)_F$ for $I\in\cC$. We then find a
monomial $m$ and a diagonal matrix $D$ with monomials for its diagonal
entries such that $\widetilde p_F=mp_F\in R(\Delta(p_F))$, $\widetilde
A_\cC=mDA_\cC D^{-1}$ is over $R(\Delta(A_\cC))\subset R(\Delta(p_F))$,
$\widetilde r_\cC=Dr_\cC$ is over $R(\Delta(p_F))$ and $\widetilde
A_\cC\widetilde r_\cC=\widetilde p_F\widetilde r_\cC$. Put
$R_{p_F}=R(\Delta(p_F))[\frac{1}{\widetilde p}]$, the ring obtained upon
adjoining $ \frac{1}{\widetilde p}$ to $R(\Delta(p_F))$.
Letting $d$ denote the period of $\cC$, we have
the subring $R_{A_\cC}=R(\Delta(A_\cC))[\frac{1}{\widetilde p{}^d}]$ of
$R_{p_F}$. View $R_{p_F}$ as an $R_{A_\cC}$-module. For a period class $C$
of $\cC$, let $\widetilde r_C$ be the restriction of $\widetilde r_\cC$ to $C$
and consider the $R_{A_\cC}$-submodule $\langle r_C\rangle$ of $R_{p_F}$
generated by the entries of $\widetilde r_C$. Our third requirement is that
$\lan r_C\ran$ equal $R_{p_F}$:
\item[\textbf{(c)}]If $F\in\cF(p)$ and $C$ is a period class of a principal
$F$-component $\cC$ of $A$, then the $R_{A_\cC}$-module $\lan r_C\ran$
equals $R_{p_F}$.

The following results answer (1)--(3) of the introduction.

Let $(X_M,\s_M,\mu_M)$ be a Markov chain and $(X_Q,\s_Q,\mu_Q)$
a Bernoulli shift. Then $(X_Q,\s_Q,\mu_Q)$ is a right-closing eventual
factor of $(X_M,\s_M,\mu_M)$ if and only if $\myb_M=\myb_Q$ and, considering the
$R^+_k$-matrix $A$ and polynomial $p\in R^+_k$ associated with $M,Q$, there
exists a right eigenvector $r$ of $A$ satisfying (a)--(c).

Let $(X_M,\s_M,\mu_M)$ be a Markov chain and
$(X_Q,\s_Q,\mu_Q)$ a Bernoulli shift. Then $(X_Q,\s_Q,\mu_Q)$ is a
right-closing factor of $(X_M,\s_M,\mu_M)$ if and only if $\myb_M=\myb_Q$, and,
considering the $R^+_k$-matrix $A$ and polynomial $p\in R^+_k$ associated
with $M,Q$, there exists a right eigenvector $r$ of $A$ satisfying (a)--(c)
and the factor periodic point condition holds for $A,p$. Moreover,
$(X_Q,\s_Q,\mu_Q)$ is a factor of $(X_M,\s_M,\mu_M)$ by a right-closing map
of degree $1$ if and only if, in addition, $M$ is aperiodic, $\myD(M)=\myD(Q)$ and

The \emph{past $\sigma$-algebra} $\cA_M$
of $(X_M, \sigma_M, \mu_M)$ is generated by
the cylinder sets
[e_0, e_1, \dots, e_l]_h = \{x=(x_n)\in X_M:x_h=e_0, x_{h+1}=e_1,\dots,
with $l \in \ZZ^+$, $h\in \ZZ$ and $h+l\le 0$.
Markov chains $(X_M, \sigma_M, \mu_M)$ and $(X_Q,\sigma_Q, \mu_Q)$
are said to be \emph{regularly isomorphic} if there exists a
measure-theoretic isomorphism $\Phi:X_M\to X_Q$ such that
$\Phi^{-1}(\cA_Q)\subset\sigma^{-N}_M(\cA_M)$ and
$\Phi(\cA_M)\subset\sigma_Q^{-N}(\cA_Q)$ for some
nonnegative integer $N$.

Let $(X_M,\s_M,\mu_M)$ be a Markov chain and
$(X_Q,\s_Q,\mu_Q)$ a Bernoulli shift.
The following are equivalent.
\item[(i)]$(X_M,\s_M,\mu_M)$ and $(X_Q,\s_Q,\mu_Q)$ are regularly
\item[(ii)]There exists a Markov chain $(X_R,\s_R,\mu_R)$ and right-closing
factor maps $\p:(X_R,\s_R,
\mu_R)\to(X_M,\s_M,\mu_M)$ and
$\ps:(X_R,\s_R,\mu_R)\to(X_Q,\s_Q,\mu_Q)$ of degree $1$.
\item[(iii)]$M$ is aperiodic, $\myb_M=\myb_Q$, $\myD(M)=\myD(Q)$,
$c_M\myD(M)=c_Q\myD(Q)$ and, considering the $R^+_k$-matrix $A$ and polynomial
$p\in R^+_k$ associated with $M,Q$, there exists a right eigenvector $r$ of
$A$ satisfying (a)--(c).

The equivalence of (i) and (ii) of (\ref{thm4.3}) is a special case of the
main result of \cite{BT}. The equivalence of (ii) and (iii) is deduced from
(\ref{thm4.2}) with the aid of discrete geometry and techniques adapted from
symbolic dynamics.

\section{Combinatorial formulation}
We have explained that (\ref{thm4.1}) and (\ref{thm4.2}) may be regarded as
statements on right-closing maps of $(X_A, \sigma^n_A)$ onto $(X_p,
\sigma^n_p)$. Taking our cues from \cite{BMT,PT}, we formulate such maps in
terms of state-splittings. (See \cite{LM} for more on state-splittings, which
were first employed by Williams.)

Consider an $R^+_k$-matrix $B$. A \emph{state-splitting} of $(X_B,
\sigma_B)$ results from a collection of partitions: For every state $I$ of
$G(B)$, the set of edges starting at $I$ is partitioned into sets $(I, 1),
(I, 2), \dots, (I, \rho(I))$. For each $(I, i)$ form a vector
$\alpha^{(I, i)}$ indexed by the states of $G(B)$
by letting the $J$th entry $\alpha^{(I, i)}(J)$ be given by
 the sum of the weights $\wt_B(e)$ of the edges $e$
that belong to $(I, i)$ and terminate at $J$. The sum
$\mathop{\sum}^{\rho(I)}_{i=1}\,\alpha^{(I, i)}$ then equals the $I$th
row of $B$. State-splittings of $(X_B, \sigma_B)$ correspond in this way to
partitions of rows of $B$ into vectors over $R^+_k$, leading us to also
speak of \emph{(state-)splitting} $B$. Putting $\widehat B((I, i), (J,
j))=\alpha^{(I, i)}(J)$, the partition elements index a matrix
$\widehat B$ such that $(X_{\widehat B}, \sigma_{\widehat B})$ is block
isomorphic to $(X_B, \sigma_B)$.

Fix $p\in R^+_k$ and an irreducible $R^+_k$-matrix $A$ with $\beta_A=p$.
When $Ar=pr$ and the vector $r$ is over $R^+_k$, we will seek
state-splittings of a particular form: Writing each entry $r(I)$ as a sum of
r(I)=\sum^{\rho(I)}_{k=1}x^{w(I, i)},
we will seek to partition the $I$th row of $A^n$ into $\rho(I)$ vectors
$\alpha^{(I, i)}$, $1\le i\le \rho(I)$, such that $\alpha^{(I, i)}\cdot
r=x^{w(I, i)}p^n$. When we have been able to do this, we say that we have a
\emph{splitting of $A^n$ according to $r$}. This notion derives its
significance from the following.

Let $A$ be an irreducible $R^+_k$-matrix with $\myb_A=p\in R^+_k$. The
following are equivalent.
\item[(i)]$(X_{p^n},\s_{p^n})$ is a right-closing factor of $(X_{A^n},
\s_{A^n})$ for some $n\in\NN$.
\item[(ii)] $A$ has a right eigenvector $\,r$ over $R^+_k$ such that,
for some $N\in\NN$, the matrix $A^N$ can be split according to $r$.
\item[(iii)] $A$ has a right eigenvector $\,r$ over $R^+_k$ such that
$A^n$ can be split according to $r$ for all large $n$.
\item[(iv)]$(X_p,\s_p)$ is a right-closing eventual factor of $(X_A,\s_A)$.

With both (\ref{thm4.1}) and (\ref{thm4.2}), the necessity of the conditions
is quickly established with the aid of (\ref{thm5.1}). Theorem
(\ref{thm5.1}) also enables us to use \cite{A} to deduce from (\ref{thm4.1})
the sufficiency of the conditions of (\ref{thm4.2}): By (\ref{thm4.1}) and
(\ref{thm5.1}), the conditions of (\ref{thm4.2}) provide splittings of $A^n$
according to $r$ for all large $n$. In \cite{A} Ashley shows how to use
analogous splittings and the related factor periodic point condition to
produce right-closing maps (of the first powers) in the case of shifts of
finite type. A right-closing map of $(X_A, \sigma_A)$ onto $(X_p,
\sigma_p)$ is constructed by going through the argument of \cite{A} with our
factor periodic point condition and splittings of $A^n$.

Thus, (\ref{thm4.1})--(\ref{thm4.3}) reduce to the construction of a
state-splitting of $A^n$ according to $r$. This construction takes up much
of \cite{MT5} (in the preprint version, 43 Latex pages out of 72). We provide
an outline of it.

The right eigenvector $r$ is said to \emph{totally conform to $p$} if, in
addition to satisfying (a)--(c), it has the property that every entry $r(I)$
$F$-conforms to $p$ for every proper face $F$ of $W(p)$.

The construction actually employs two state splittings. The first uses (b) and
(\ref{thm2.1}) to prove:

For $n$ large enough, $A^n$ may be state-split in such a way that the
resulting matrix $\widehat{A^n}$ has a right eigenvector $\widehat r$ which
totally conforms to $p$.

This is followed by the main splitting, outlined in the next section and
showing that if a matrix has a totally conforming right eigenvector, then
large powers of the matrix can be split according to that eigenvector.

\section{The main state-splitting}
By (\ref{thm5.1}) and (\ref{prop5.2}), we may now replace $A$ by a suitable
power to assume that $r$ totally conforms to $p$ and that every $F$-component
of $A$ is aperiodic for every $F\in\cF(p)$. Using (a)--(c) and total
conformity, pick $L\in\ZZ^+$, $u_{F, J}\in\ZZ^k$ and, for each principal
component $\cC$ and coset $x^u\Delta(A_\cC)\in\Delta(p_F)/\Delta(A_\cC)$,
pick $c_\cC\in\Delta(p_F)$, $D_{F, J}\in\Log(\Delta(p_F))$ and polynomials
$q^{(F, u)}(J)\in R(\Delta(A_\cC))$ such that:
\item[(1)]$\Log(r(J))+u_{F,J}\subset\Log(p^L)$ and $\Log(r(J)_F)+u_{F,J}\subset
\Log(p^L_F)$ for every $J\in S$;
\item[(2)]for each principal $F$-component $\cC$, the matrix $\widetilde A_\cC$
indexed by (the states of) $\cC$ and with
\widetilde A_\cC(J, J')=A_\cC(J, J')\, \frac{x^{u_{F,J}}\,x^{D_{F,J}}}{c_\cC\,
is over $R(\Delta(A_\cC))$;
\item[(3)] $\sum_{J\in\cC}q^{(F,u)}(J)\,x^{u_{F,J}}x^{D_{F,J}}r(J
x^up^L_F\,$ for every $F\in\cF(p)$, principal $F$-component $\cC$ and

Denote the starting and terminal states of a $G(A)$-path $\gamma$ by
$s(\gamma)$ and $t(\gamma)$. Let $W_n(I)$ be the set of $G(A)$-paths
$\gamma$ of length $n$ and with $s(\gamma)=I$. Let $W_n(I, J)$ be the set of
$\gamma\in W_n(I)$ with $t(\gamma)=J$; and $W_n(I, J, w)$ the set of
$\gamma\in W_n(I, J)$ with $\Log(\wt(\gamma))=w$. For $a\in\Log(r(I)p^n)$,
let $\pi(n,a)$ denote the probability vector of length $\rho(I)$ whose
$i$th entry equals $\pi(n, a, i)=(x^{w(I, i)}p^n)_a/(r(I)p^n)_a$. The
following lemmas make use of (\ref{prop2.3}) and

Let $\theta, \bar\epsilon_F>0$ and $D, l\in\ZZ^+$. There exist
$\epsilon_F>0$ and $N\in\NN$ such that
|\pi(n+l, a+lc_F+b, i)/\pi(n, a, i)-1|<\theta
whenever $n\ge N$, $\frac{a}{n}\in B(F, \epsilon_F)\backslash B(\partial F,
\bar\epsilon_F)$, $(p^n)_a\ne 0$, $b\in\Log(\Delta(p_F))$ and $\|b\|\le D$.

Let $F\in\cF(p)$, $\bar\epsilon_F>0$ and $\bar D\in\NN$. There exist
$\epsilon_F>0$ and $N,\bar K\in\NN$ such that, for every $n\ge N$ and
$a\in\ZZ^k$ with $\frac{a}{n}\in B(F, \epsilon_F)\backslash B(\partial
F,\bar\epsilon_F)$, we can find a principal $F$-component $\cC$, $J_1\in\cC$
and $x^u\Delta(A_\cC)\in \Delta(p_F)/\Delta(A_\cC)$ with
|W_n(I,J_1, a-u+b-Lc_F+u_{F,J_1}+D_{F,J_1})|\ge \frac{1}{\bar K}(r(I)p^n)_a
whenever $b\in\Log(\Delta(A_\cC))$ and $\|b\|\le\bar D$.

Define the \emph{mass} of $\gamma\in W_n(I)$ to be
$\mass(\gamma)=\wt(\gamma)\,r(t(\gamma))$. For $W\subset W_n(I)$, let
$\mass(W)$ equal the sum of $\mass(\gamma)$ over $\gamma\in W$. Our goal is
to partition $W_n(I)$ into sets $(I, i)$, $1\le i\le\rho(I)$, so that
$\mass((I, i))=x^{w(I, i)}p^n$. We will speak of allocating the elements of
$W_n(I, J, w)$ according to $\pi(n, w+a)$, where $a\in\Log(r(J))$: What we
mean by this is that the number of elements placed in the $i$th set differs
from (the possibly nonintegral) number $\pi(n, w+a, i)|W_n(I, J, w)|$ by
less than 1. Condition (1) guarantees that, any time we assign paths from
$W_n(I, J, w)$ to the $i$th set, we have

For the sake of simplicity, we restrict our attention to the case $k=2$.

For each $F\in\cF(p)$ and $J$, fix $a_{F, J}\in\Log(r(J)_F)$. Write
$a_J=a_{W(p), J}$. For $F\in\cF(p)$, also fix $v_F\in\ZZ^k$ with coprime
entries which exposes $F$ by having $F\cdot v_F=\min\{W(p)\cdot v_F\}$.
Write $\delta_F=\min\{W(p)\cdot v_F\}=\min\{\Log(p)\cdot v_F\}$ and
$\delta_F(J)=\min\{r(J)\cdot v_F\}$. For $F\in\cF(p)$ and $\lambda\in\ZZ$,
\cN_n(F, \lambda)=\{a\in\Log(r(I)p^n):a\cdot v_F\le\delta_F(I)+n\delta_F+
Recalling $k=2$, define $\cN_n(\lambda_0)$ to be the union of
$\cN_n(F_0,\lambda_0)$ over all vertices $F_0$ of $W(p)$ and define
where the last union is over all edges $F_1$ of $W(p)$. Let
$W_n(F, \lambda)$, $W_n(\lambda_0)$, $W_n(\lambda_0, \lambda_1)$ respectively
denote the union of the sets $W_n(I, J, w)$ over all $(J, w)$ such that
$w+\Log(r(J))$ intersects $\cN_n(F, \lambda), \cN_n(\lambda_0),
\cN_n(\lambda_0, \lambda_1)$. We drop the subscript from our notation when
the length is clear from the context.

Consider $\xi$, $D\ge 0$ and a partition of $W_n(\lambda_0, \lambda_1)$ into
sets $P(i)$, $1\le i\le \rho(I)$, satisfying two conditions:
\item[(i)]For every $a\in\cN_n(\lambda_0, \lambda_1)$ we have
$\mass(P(i))_a=(x^{w(I, i)} p^n)_a$.
\item[(ii)]If $W_n(I, J, w)\subset W_n(\lambda_0, \lambda_1)\bs
W_n(\lambda_0-D, \lambda_1-D)$, then
1-\xi\le \frac{|W_n(I, J, w)\cap P(i)|}{\pi(n, w+a_J, i)|W_n(I, J, w)|}\le
For a suitable choice of the constant $D$, we have:

For every small $\epsilon>0$ there exist $\xi\ge 0$ and $N\in\NN$ such
that, for any $n\ge N$ and $\lambda_0, \lambda_1>n\epsilon$, the existence
of a partition of $W_n(\lambda_0, \lambda_1)$ into $\rho(I)$ sets $P(i)$
satisfying (i) and (ii) implies the existence of a partition of $W_{n+L}(I)$
into $\rho(I)$ sets $(I, i)$ such that $\mass((I, i))=x^{w(I, i)}p^{n+L}$.

\begin{proof}[Sketch of proof] If $W_n(I, J, w)$ is not contained in
$W_n(\lambda_0, \lambda_1)$, allocate it according to $\pi(n, w+a_J)$. Let
$P'(i)$ be the set containing these allocations and $P(i)$. Let $\widetilde
P(i)\subset W_{n+L}(I)$ be obtained by extending the paths in $P'(i)$:
\widetilde P(i)=\{\gamma\eta:\gamma\in P'(i), \eta\in W_L(t(\gamma))\}.
Then $\mass(\widetilde P(i))=\mass(P'(i))p^L$, and $\mass(P'(i))_a=(x^{w(I,
i)}p^n)_a$ for all $a\in W_n(\lambda_0, \lambda_1)$. Consider
$a\in\Log(p^nr(I))\bs W_n(\lambda_0, \lambda_1)$. Taking $F=W(p)$,
$x^u\Delta(A) = x^a\Delta(A)$ in (3), writing $q^{(u)}(J)=q^{(W(p),u)}(J)$
and multiplying by $x^{a-u}$,
\sum_J \,x^{a-u+u_{W(p),J}+D_{W(p),J}}q^{(u)}(J)\,r(J)=x^ap^L.
Consider moving, for each $b\in\Log(q^{(u)}(J))$, exactly $q^{(u)}(J)_b$
elements of
$W_{n+L}(I, J,\\mya-u+u_{W(p),J}+D_{W(p),J}+b)$
from $\widetilde
P(j)$ to $\widetilde P(i)$. (If $q^{(u)}(J)_b<0$, we move the paths from
$\widetilde P(i)$ to $\widetilde P(j)$.) By the above equation, this adds
$x^ap^L$ to $\mass(\widetilde P(i))$ and subtracts $x^ap^L$ from
$\mass(\widetilde P(j))$. To complete the proof, one uses (ii),
(\ref{lem6.1}) and (\ref{lem6.2}) to verify that, for small $\xi$ and large
$n$, the sets $\widetilde P(j)$ contain the paths required to repeat this
``correction'' $(x^{w(I, i)}p^n)_a-\mass(P'(i))_a$ times.

A partition to which (\ref{lem6.3}) applies is constructed by induction.
For ease of exposition, we assume $L=0$. This enables us
to avoid having, as we did in the
sketch of the proof of (\ref{lem6.3}), to extend paths by $L$ each time we
use (3) to make corrections. For small $\epsilon_0>\epsilon_1>0$, the
induction will have $\epsilon_0n+\epsilon_1n$ stages. (The choices of
$\epsilon_0$, $\epsilon_1$ are indicated below.) For $\lambda\in\{1, \dots,
\epsilon_0n+\epsilon_1n\}$, we will describe the sets $P(\lambda, i)$,
$1\le i\le\rho(I)$, we have at the end of stage $\lambda$.

For $\lambda\in\{1, \dots, \epsilon_0n\}$, we only make allocations near
vertices of $W(p^n)$: We start the $\lambda$th stage having partitioned
$W(\lambda-2)=W_n(\lambda-2)$ into sets $P(\lambda-1,i)$ with
$\mass(P(\lambda-1,i))_a=(x^{w(I, i)}p^n)_a$ for all $a\in\cN(\lambda-2)$.
Then, for each vertex $F_0$, allocate according to $\pi(n, w+a_{F_0,J})$
each $W(I, J, w)$ contained in $W(F_0, \lambda-1)\bs W(\lambda-2)$.
According to (\ref{prop2.2}), by picking $\epsilon_0$ small and $n$ large,
we can be sure that for $a\in\cN(F_0, \lambda-1)\bs\cN(\lambda-2)$ the
coefficient $\mass(P(\lambda-1, i))_a$ will be an arbitrarily small fraction
of $(x^{w(I, i)}p^n)_a$. Combining this with (\ref{lem6.2}), we are assured
of the supply of paths needed to use (3) to make corrections and obtain
$P(\lambda, i)$ with $\mass(P(\lambda, i))_a=(x^{w(I, i)}p^n)_a$ for all

From $\lambda=\epsilon_0n+1$, we allocate paths near the edges of $W(p^n)$
as well. For $\lambda\in\{\epsilon_0n+1, \dots, \epsilon_0n+\epsilon_1n\}$
the $\lambda$th stage consists of $d$ steps. (The choice of $d$ will become
clear in a minute.) We start the $\lambda$th stage having partitioned
$W(\lambda-2, \lambda-\epsilon_0n-2)$ into sets $P(\lambda-1,i)$ with
$\mass(P(\lambda-1,i))_a=(x^{w(I, i)}p^n)_a$ for all $a\in\cN(\lambda-2,
\lambda-\epsilon_0n-2)$. In step 1, we allocate and correct near edges: For
each edge $F_1$, allocate according to $\pi(n, w+a_{F_1,J})$ each $W(I, J,
w)$ contained in $W(F_1, \lambda-\epsilon_0n-1)\bs W(\lambda-2,
\lambda-\epsilon_0n-2)$. Then use (3) to correct: We employ (\ref{prop2.2})
and (\ref{lem6.2}) as before, this time applying them to edges. In addition,
we use (\ref{lem6.1}) to see that errors due to the choice of $a_{F_1,J}$
may be corrected. While getting the coefficients correct for $a\in\cN(F_1,
\lambda-\epsilon_0n-1)$, step 1 may perturb those of boundedly many
$a\in\cN(\lambda-2)\bs \cN(\lambda-d)$. Steps 2 to $d-1$ make up for the
perturbations. (Step $\mu\in\{2, \dots, d-1\}$ gets $\cN(\lambda-d+\mu-1)$
correct.) Applying (\ref{prop2.2}) to edges, for small $\epsilon_1$
and large $n$, the perturbations will be small enough for us to
correct them by employing (3) near vertices. Finally, step $d$ allocates
and corrects at $a\in\cN(\lambda-1)\bs\cN(\lambda-2, \lambda-\epsilon_0n-1)$
to yield $P(\lambda, i)$ with $\mass(P(\lambda, i))_a = (x^{w(I, i)}p^n)_a$
for all $a\in\cN(\lambda-1, \lambda-\epsilon_0n-1)$.


\bibitem[A]{A}J. Ashley, Resolving factor maps for shifts of finite type
with equal entropy, \emph{Ergod.\ Th.\ and Dynam.\ Sys.} \textbf{ 11} (1991),
219--240. \MR{92d:58056}
\bibitem[AMT]{AMT}J. Ashley, B. Marcus and S. Tuncel, The classification
of one-sided Markov chains, \emph{Ergod.\ Th.\ and Dynam.\ Sys.} \textbf{ 17}
(1997), 269--295. \MR{98k:28021}
\bibitem[BMT]{BMT}M. Boyle, B. Marcus and P. Trow, Resolving maps and the
dimension group for shifts  of finite type, \emph{Mem.\ Amer.\ Math.\ Soc.}
\textbf{ 377} (1987). \MR{89c:28019}
\bibitem[BT]{BT}M. Boyle and S. Tuncel, Regular isomorphism of Markov
chains is almost topological, \emph{Ergod.\ Th.\ and Dynam.\ Sys.} \textbf{ 10}
(1990), 89--100. \MR{92i:28021}
\bibitem[H]{H}D. Handelman, Positive polynomials and product type actions
of compact groups, \emph{Mem.\ Amer.\ Math.\ Soc.}
\textbf{ 320} (1985). \MR{86h:46091}
\bibitem[LM]{LM}D. Lind and B. Marcus, \emph{An Introduction to Symbolic
Dynamics and Coding}, Cambridge Univ.\ Press, Cambridge, 1995. \MR{97a:58050}
\bibitem[M]{M}B. Marcus, Factors and extensions of full shifts,
\emph{Monatshefte  Math.} \textbf{ 88} (1979), 239--247. \MR{81g:28023}
\bibitem[MT1]{MT1}B. Marcus and S. Tuncel, The weight-per-symbol polytope
and scaffolds of invariants associated with Markov chains, \emph{Ergod.\
Th.\ and Dynam.\ Sys.} \textbf{ 11} (1991), 129--180. \MR{92g:28038}
\bibitem[MT2]{MT2}B. Marcus and S. Tuncel, Entropy at a weight-per-symbol
and embeddings of Markov chains, \emph{Invent.\ Math.} \textbf{ 102} (1990),
235--266. \MR{91k:28023}
\bibitem[MT3]{MT3}B. Marcus and S. Tuncel, Matrices of polynomials,
positivity, and finite equivalence of Markov chains, \emph{J.\ Amer.\
Math.\ Soc.} \textbf{ 6} (1993), 131--147. \MR{93e:28022}
\bibitem[MT4]{MT4}B. Marcus and S. Tuncel, On large powers of positive
polynomials in several variables, preprint.
\bibitem[MT5]{MT5}B. Marcus and S. Tuncel, Resolving Markov chains onto
Bernoulli shifts, preprint.
\bibitem[O]{O}D. Ornstein, \emph{Ergodic Theory, Randomness and Dynamical
Systems}, Yale Univ.\ Press, New Haven, 1974. \MR{56:5836}
\bibitem[PS]{PS}W. Parry and K. Schmidt, Natural coefficients and
invariants for Markov shifts, 
\emph{Invent.\ Math.} \textbf{ 76} (1984), 15--32. \MR{86b:28022a}
\bibitem[PT]{PT}W. Parry and S. Tuncel, On the stochastic and
topological structure of Markov chains, \emph{Bull.\ London Math.\ Soc.}
\textbf{ 14} (1982), 16--27. \MR{84i:28024}
\bibitem[T]{T}S. Tuncel, Faces of Markov chains and matrices of
polynomials, \emph{Contemp.\ Math.}, Vol.\ 135, Amer.\ Math.\ Soc.,
Providence, 1992, pp.\ 391--422. \MR{94m:28034}