\documentclass[12pt,reqno]{article}
\usepackage[usenames]{color}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{amscd}
\usepackage[colorlinks=true,
linkcolor=webgreen,
filecolor=webbrown,
citecolor=webgreen]{hyperref}
\definecolor{webgreen}{rgb}{0,.5,0}
\definecolor{webbrown}{rgb}{.6,0,0}
\usepackage{color}
\usepackage{fullpage}
\usepackage{float}
\usepackage{psfig}
\usepackage{graphics,amsmath,amssymb}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{latexsym}
\usepackage{epsf}
\setlength{\textwidth}{6.5in}
\setlength{\oddsidemargin}{.1in}
\setlength{\evensidemargin}{.1in}
\setlength{\topmargin}{-.5in}
\setlength{\textheight}{8.9in}
\newcommand{\seqnum}[1]{\href{http://oeis.org/#1}{\underline{#1}}}
\DeclareMathOperator{\Pref}{Pref}
\DeclareMathOperator{\Fac}{Fac}
\begin{document}
\begin{center}
\epsfxsize=4in
\leavevmode\epsffile{logo129.eps}
\end{center}
\theoremstyle{plain}
\newtheorem{theorem}{Theorem}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{proposition}[theorem]{Proposition}
\theoremstyle{definition}
\newtheorem{definition}[theorem]{Definition}
\newtheorem{example}[theorem]{Example}
\newtheorem{conjecture}[theorem]{Conjecture}
\theoremstyle{remark}
\newtheorem{remark}[theorem]{Remark}
\begin{center}
\vskip 1cm{\LARGE\bf
Some Properties of Abelian Return Words
}
\vskip 1cm
\large
M. Rigo,
P. Salimov,\footnote{The second author is supported by the Russian President's grant no. MK-4075.2012.1, the Russian Foundation for Basic Research grant no. 11-01-00997 and by a University of Li{\`e}ge post-doctoral grant.}
and E. Vandomme\\
Department of Mathematics \\
University of Li{\`e}ge \\
Grande traverse 12 (B37)\\
B-4000 Li{\`ege} \\
Belgium \\
\href{mailto:M.Rigo@ulg.ac.be}{\tt M.Rigo@ulg.ac.be}
\href{mailto:E.Vandomme@ulg.ac.be}{\tt E.Vandomme@ulg.ac.be} \\
\end{center}
\vskip .2 in
\begin{abstract}
We investigate some properties of abelian return words as recently
introduced by Puzynina and Zamboni. In
particular, we obtain a characterization of Sturmian words with
nonzero intercept in terms of the finiteness of the set of
abelian return words to all prefixes. We describe this set of
abelian returns for the Fibonacci word but also for the
$2$-automatic Thue--Morse word. We also investigate the
relationship existing between abelian complexity and finiteness of
the set of abelian returns to all prefixes.
\end{abstract}
\section{Introduction}
Many notions occurring in combinatorics on words have been recently
and fruitfully extended to an abelian context. Two words $u$ and $v$
are said to be {\em abelian equivalent} if $u$ is a permutation of the letters in $v$
and usually, the corresponding concepts are defined up to such an
equivalence. This framework gives rise to many challenging questions
in combinatorics on words: what kind of information is preserved in the abelian context?
To what extent can the
classical results be applied? What kind of characterization can we obtain? For instance, consider the classical notion
of {\em factor complexity} $p_{\mathbf{x}}$ which maps an integer $n\ge 0$ to the number of distinct
factors of length $n$ occurring in an infinite word $\mathbf{x}$.
The well-known theorem of Morse--Hedlund gives a characterization of the ultimately periodic
words. See for instance
\cite{Lot1}. Sturmian words are defined by the property $p_{\mathbf{x}}(n)=n+1$ for all $n\ge 0$. The analogue to factor complexity is the {\em abelian complexity} of $\mathbf{x}$
which maps $n\ge 0$ to the number of distinct abelian classes partitioning the set
of factors of length $n$ occurring in $\mathbf{x}$. This latter notion was already introduced in the 1970's \cite{Cov}. Some other important questions in combinatorics on words such as avoiding abelian repetitions, were initiated at the same period. See for instance \cite{Dek}. See also the reference \cite{Rich} on abelian complexity which contains many relevant bibliographic pointers.
The return word is a classical notion in combinatorics on words and
symbolic dynamical systems \cite{r1,r2,r3}. For instance, Durand obtained a
characterization of primitive substitutive sequences in terms of
return words and derived sequences \cite{Dur}. Let $u$ be a recurrent
factor of $\mathbf{x}$, i.e., a factor occurring infinitely
many times in $\mathbf{x}$. A {\em return word} to $u$ is a factor
separating two consecutive occurrences of $u$. In this paper, we
consider the abelian analogue of this notion of return word. Such a
study has been recently presented by Puzynina and Zamboni during
the WORDS 2011 conference. Here we focus on different aspects of
abelian returns and we hope that our results can be seen as
complementary to those found by Puzynina and Zamboni \cite{PZ}.
The main difference is that we usually consider the set of abelian
returns with respect to all the factors of an infinite word
$\mathbf{x}$, while Puzynina and Zamboni \cite{PZ} study the set of abelian
returns with respect to each factor taken separately. In particular,
their main contribution is a characterization of
Sturmian words: {\em a recurrent infinite word is Sturmian if and only
if each of its factors has two or three abelian returns}. Sturmian
words have been extensively studied and, in particular, every Sturmian
word can be obtained by coding the orbit $(R_\alpha^n(\rho))_{n\ge 0}$
of a point $\rho$ on a circle under a rotation $R_\alpha$ by an
irrational angle $\alpha$ when the circle is partitioned in a suitable
way into two complementary intervals. The irrational $\alpha$ is
called the {\em slope} of the Sturmian word and the initial point
$\rho$ is its {\em intercept}. See for instance \cite{Lot,MH}. Many of
our results on Sturmian words rely on this definition of Sturmian
words.
This paper is organized as follows. In Section~\ref{sec:1}, we
present the main definitions and notation used in this paper. In
Section~\ref{sec:2}, we discuss the relationship with periodicity and
we prove that a recurrent word is periodic if and only if its set of
abelian returns is finite. We also construct an abelian uniformly
recurrent word which is not eventually recurrent. In
Section~\ref{sec:3}, we restrict ourselves to the set
$\mathcal{APR}_\mathbf{x}$ of abelian returns to all prefixes. In
particular, this set is finite for any uniformly recurrent and abelian
periodic word. We study the special case of the Thue--Morse word
$\mathbf{t}$ \cite{astm} and show that the set of abelian returns to
all prefixes of $\mathbf{t}$ contains 16 elements. Next, we obtain a
characterization of Sturmian words with (non)zero intercept as
follows. Let $\mathbf{x}$ be a Sturmian word coding an orbit
$(R_\alpha^n(\rho))_{n\ge 0}$. The set $\mathcal{APR}_\mathbf{x}$ of
abelian returns to the prefixes of $\mathbf{x}$ is finite if and only
if $\mathbf{x}$ does not have a null intercept (see
Theorem~\ref{thmSturmianImplF}). The celebrated Fibonacci word
$\mathbf{f}$ can be defined with a slope and an intercept both equal
to $1/\tau^2$ where $\tau$ is the Golden mean. Therefore our result
implies that $\mathcal{APR}_\mathbf{f}$ is finite. We show that this
set contains exactly $5$ elements. Interestingly, our developments can
be related to the lengths of the palindromic prefixes of $\mathbf{f}$. See for instance
\cite{luca,Fis}. By contrast the set of abelian returns to all
prefixes for the word ${\tt 0}\mathbf{f}$ is infinite. Then we show
that if $\mathbf{x}$ is an abelian recurrent word such that
$\mathcal{APR}_{\mathbf{x}}$ is finite, then $\mathbf{x}$ has bounded
abelian complexity. In the last section of this paper, we introduce
the notion of abelian derived sequence. If a word $\mathbf{x}$ is uniformly recurrent, then
$\mathbf{x}$ can be factored in terms of abelian returns to a given
prefix of $\mathbf{x}$. This gives rise to a coding that allows one to
define a new sequence. Contrary to the non-abelian case and the
characterization obtained by Durand, the
Thue--Morse word is an example of word having infinitely many derived sequences.
\section{Preliminaries}\label{sec:1}
An infinite word $\mathbf{x}$ is said to be
{\em recurrent} if every factor $u$ of $\mathbf{x}$ appears
infinitely often in $\mathbf{x}$. An infinite word $\mathbf{x}$ is said to be
{\em uniformly recurrent}, if it is recurrent and the distance between any two consecutive
occurrences of any of its factors $u$ is bounded by a constant depending only on $u$. The language of all the finite factors (resp. prefixes) of an infinite word $\mathbf{x}$ is denoted by $\Fac(\mathbf{x})$ (resp. $\Pref(\mathbf{x})$). Let $i,j$ be such that $i\le j$. The factor $x_ix_{i+1}\cdots x_j$ of $\mathbf{x}=x_0x_1\cdots$ is denoted by $\mathbf{x}[i,j]$. The notation $\mathbf{x}[i,i]$ is shortened to $\mathbf{x}_i$.
\subsection{Return words}\label{sec:21}
The classical notion of return word
has been used by Durand \cite{Dur} but was previously introduced by Boshernitzan
\cite{Bos} (see also \cite{Gri} for the notion of induced
transformation in a dynamical context). Let $u$ be a prefix of a
uniformly recurrent word $\mathbf{x}$. A nonempty factor $w$ of
$\mathbf{x}$ is a {\em return word} to $u$, if there exists some $i\ge
0$ such that
\begin{itemize}
\item $\mathbf{x}[i,i+|w|-1]=w$,
\item $\mathbf{x}[i,i+|u|-1]=u=\mathbf{x}[i+|w|,i+|w|+|u|-1]$,
\item $\mathbf{x}[i+j,i+j+|u|-1]\neq u$ for all $j\in\{1,\ldots,|w|-1\}$.
\end{itemize}
We denote by
$\mathcal{R}_{\mathbf{x},u}$ the set of return words to $u$. Since $\mathbf{x}$ is uniformly recurrent, this set is finite because the length of the longest return word to $u$ is bounded by the maximal distance between two successive occurrences of $u$. If we order the return words to $u$ with respect to their first occurrence in $x$, then the corresponding map is $\Lambda_{\mathbf{x},u}:\mathcal{R}_{\mathbf{x},u}\to\{{\tt 1},\ldots,\#(\mathcal{R}_{\mathbf{x},u})\}=:{R}_{\mathbf{x},u}$. Since $\mathcal{R}_{\mathbf{x},u}$ is a code \cite{Dur}, i.e., any element in $\mathcal{R}_{\mathbf{x},u}^*$ has a unique factorization as return words to $u$, $\mathbf{x}$ can be written in a unique way as $\mathbf{x}=m_0m_1m_2\cdots$ and the {\em derived sequence} $\mathcal{D}_u(\mathbf{x})$ is an infinite word $\Lambda_{\mathbf{x},u}(m_0)\Lambda_{\mathbf{x},u}(m_1)\Lambda_{\mathbf{x},u}(m_2)\cdots$ over ${R}_{\mathbf{x},u}$.
\subsection{Abelian returns}
Recently, the notion of return words has been generalized to an
abelian framework \cite{PZ}. In this paper, we will distinguish two cases: abelian
return to a prefix and abelian return to a factor. We make such a
distinction to be able to define in the first case the abelian derived
sequence. Let us start with a few definitions.
Let $A=\{a_1,\ldots,a_k\}$ be a $k$-letter alphabet.
We denote by $|w|_{a_i}$ the number of occurrences of the letter $a_i$ in a word $w\in A^*$.
The Parikh mapping $\Psi:A^*\to\mathbb{N}^k$ is defined by $\Psi(w)=(|w|_{a_1},\ldots,|w|_{a_k})$. Let $u,v$ be two
finite words of the same length. We say that $u$ and $v$ are {\em
abelian equivalent} and we write $u\sim_{ab} v$ if $\Psi(u)=\Psi(v)$.
An infinite word $\mathbf{x}$ is {\em abelian periodic (of period
$m$)}, if it can be factored as $\mathbf{x}=u_1u_2u_3\cdots$
where, for all $i,j$, the finite words $u_i$ and $u_j$ have the same length $m$ and
are abelian equivalent. The smallest $m$ for which such a
factorization exists is called the {\em abelian period} of $\mathbf{x}$. For instance, the Thue--Morse word is an
infinite concatenation of ${\tt 01}$ and ${\tt 10}$ and is thus
abelian periodic of period $2$.
Let $\mathbf{x}$ be an infinite word. If, for each factor $u$ of
$\mathbf{x}$, there exist infinitely many $i$ such that
$\mathbf{x}[i,i+|u|-1]\sim_{ab} u$, then $\mathbf{x}$ is said to be
{\em abelian recurrent}.
If $\mathbf{x}$ is abelian recurrent and if, for each factor $u$
of $\mathbf{x}$, the distance between any two consecutive
occurrences of factors abelian equivalent to $u$ is bounded by a
constant depending only on $u$, then $\mathbf{x}$ is said to be
{\em abelian uniformly recurrent}.
\begin{remark}
Note that uniform recurrence implies obviously abelian uniform
recurrence. We will show in Proposition~\ref{pro:aur} that the
converse does not hold.
\end{remark}
\begin{definition}
Let $u$ be a prefix of an abelian
uniformly recurrent word $\mathbf{x}$. We say that a nonempty
factor $w$ of $\mathbf{x}$ is an {\em abelian return} to $u$, if
there exists some $i\ge 0$ such that
\begin{itemize}
\item $\mathbf{x}[i,i+|w|-1]=w$,
\item $\mathbf{x}[i,i+|u|-1]\sim_{ab} u \sim_{ab} \mathbf{x}[i+|w|,i+|w|+|u|-1]$,
\item $\mathbf{x}[i+j,i+j+|u|-1]\not\sim_{ab} u$, for all $j\in\{1,\ldots,|w|-1\}$.
\end{itemize}
We denote by $\mathcal{APR}_{\mathbf{x},u}$ the set of abelian returns
to the prefix $u$. Since $\mathbf{x}$ is abelian uniformly recurrent,
then the set $\mathcal{APR}_{\mathbf{x},u}$ is finite. We define the
set of abelian returns to prefixes as
$$\mathcal{APR}_{\mathbf{x}}:=\bigcup_{u\in\Pref(\mathbf{x})}
\mathcal{APR}_{\mathbf{x},u}.$$ Observe that if $\mathbf{x}$ is
uniformly recurrent, then the length of the longest element in
$\mathcal{APR}_{\mathbf{x},u}$ is bounded by the length of the
longest element in $\mathcal{R}_{\mathbf{x},u}$.
\end{definition}
We will also consider a more general situation where $u$ is
not restricted to be a prefix of $\mathbf{x}$. Puzynina and Zamboni \cite{PZ}
called this notion a {\em semi-abelian return} to the abelian class of $u$ and
the number of abelian returns is the number of distinct abelian
classes of semi-abelian returns.
\begin{definition}
If $\mathbf{x}$ is abelian recurrent and
if $u$ is a factor of $\mathbf{x}$, we can define as above the
notion of abelian return to $u$. The corresponding set
$\mathcal{AR}_{\mathbf{x},u}$ of abelian returns to $u$ is well
defined. We define the set of abelian returns as
$$\mathcal{AR}_{\mathbf{x}}:=\bigcup_{u\in \Fac(\mathbf{x})}
\mathcal{AR}_{\mathbf{x},u}.$$
\end{definition}
\begin{remark}
Let $\mathbf{x}$ be an abelian recurrent word. The set
$\mathcal{AR}_{\mathbf{x},u}$ is finite, for each factor $u$ of
$\mathbf{x}$, if and only if $\mathbf{x}$ is abelian uniformly
recurrent.
\end{remark}
\section{Finiteness of the set of abelian returns}\label{sec:2}
Puzynina and Zamboni \cite{PZ} provided a discussion between periodicity and the number of abelian returns. Here we take the finiteness of the set
of abelian returns to characterize periodicity.
\begin{theorem} Let $\mathbf{x}$ be a recurrent word. The set
$\mathcal{AR}_{\mathbf{x}}$ is finite if and only if $\mathbf{x}$
is periodic.
\end{theorem}
\begin{proof} The ``if'' part is obvious. We prove the ``only if'' part.
Suppose that $\mathcal{AR}_{\mathbf{x}}$ is finite and that
$\mathbf{x}$ is recurrent but not periodic. In this case, for
each $k$, there exists a word $u$ satisfying $|u|>k$ such that
$au,bu \in \Fac(\mathbf{x})$ for some letters $a\not=b$. Hence
there exist $i,j$ such that $i0\}$.
Let $\mathbf{y}=y_0y_1\cdots$ be the word defined by $y_j={\tt
0}$, if $j\in J$, and $y_j=t_j$ otherwise. Define
$\mathbf{x}=\varphi_{TM}(\mathbf{y})$.
The word $\mathbf{x}$ coincides with $\mathbf{y}$ almost
everywhere, except for positions from the set $2J\cup(2J+1)$.
Hence, each factor of the Thue--Morse word occurs in $\mathbf{x}$
uniformly, i.e., with bounded gaps. At the same time, the
factor $\varphi_{TM}({\tt 000})$ occurs in $\mathbf{x}$ with
strictly growing gaps. Hence $\mathbf{x}$ is not eventually
recurrent.
Let us now prove that $\mathbf{x}$ is abelian uniformly recurrent.
First we point out a property of the Thue--Morse word: for all
$d>0$ and all $a\in\{{\tt 0},{\tt 1}\}$, there exists $k$ such
that $t_k=a \neq t_{k+d}$. This property follows from the
well-known fact that the Thue--Morse word does not contain any
constant infinite arithmetical subsequence \cite{Sh}.
As $\mathbf{x}$ is abelian periodic (of period $2$), the weight
(i.e., the sum of digits) of each factor $u$ of
$\mathbf{x}$ of odd length is either $\frac{|u|+1}{2}$ or
$\frac{|u|-1}{2}$. Note that $y_i={\tt 0}$ implies
$|\mathbf{x}[2i+1,2i+|u|]|_{\tt 1}=\frac{|u|+1}{2}$ and $y_i={\tt
1}$ implies $|\mathbf{x}[2i+1,2i+|u|]|_{\tt 1}=\frac{|u|-1}{2}$.
Since ${\tt 0}$ (resp. ${\tt 1}$) occurs with bounded gaps in
$\mathbf{y}$, gaps between abelian occurrences in
$\mathbf{x}$ of a factor of odd length are bounded.
The weight of a factor $u$ of even length of $\mathbf{x}$ can take
values $\frac{|u|}{2}$, $\frac{|u|}{2}+1$ and $\frac{|u|}{2}-1$.
The first case takes place when $u$ occurs at an even
position in $\mathbf{x}$, meaning that the gaps between abelian
occurrences of $u$ of weight $\frac{|u|}{2}$ in $\mathbf{x}$ are
bounded. The last two cases take place if $u$ occurs in
$\mathbf{x}$ at an odd position $i$ and if $y_{\frac{i-1}{2}}={\tt
1}$ and $y_{\frac{i-1+|u|}{2}}={\tt 0}$ or,
$y_{\frac{i-1}{2}}={\tt 0}$ and $y_{\frac{i-1+|u|}{2}}={\tt 1}$.
Due to the mentioned property of the Thue--Morse word, there
exists $k$ such that $t_k={\tt 1}\neq t_{k+\frac{|u|}{2}}$ (resp.
$t_k={\tt 0}\neq t_{k+\frac{|u|}{2}}$) and since $\mathbf{t}$ is
uniformly recurrent, the factor $\mathbf{t}[k,k+\frac{|u|}{2}]$
occurs infinitely often with bounded gaps in $\mathbf{t}$. Hence
abelian occurrences of $u$ in $\mathbf{x}$ appear infinitely often
with bounded gaps.
\end{proof}
\section{Finiteness of the set of abelian returns to prefixes}\label{sec:3}
Contrary to the finiteness of $\mathcal{AR}_{\mathbf{x}}$, the
finiteness of $\mathcal{APR}_{\mathbf{x}}$ does not imply periodicity
nor abelian periodicity of $\mathbf{x}$. Moreover, if $\mathbf{x}$
is uniformly recurrent, it is well-known that
$$\min_{v\in\mathcal{R}_{\mathbf{x},u}} |v|\to\infty, \text{ if }
|u|\to\infty,$$ meaning that taking longer prefixes eventually leads
to longer return words. Here we show that such a result does not hold
for abelian returns to prefixes. Indeed, for the Thue--Morse word the
corresponding set $\mathcal{APR}_{\mathbf{t}}$ is finite and can be
described precisely. Such a result also holds for the Fibonacci word.
In particular, amongst the set of Sturmian words, the finiteness of
$\mathcal{APR}_{\mathbf{x}}$ characterizes Sturmian
words with nonzero intercept.
\begin{lemma}\label{lemmaAPandURImplF} If $\mathbf{x}$ is a uniformly
recurrent and abelian periodic word, then the set
$\mathcal{APR}_{\mathbf{x}}$ is finite.
\end{lemma}
\begin{proof} Let $m$ be the (minimal) abelian period of $\mathbf{x}$.
Let us find an upper bound for the length of an abelian return $u$
to a prefix $p$ of $\mathbf{x}$.
Suppose first that $|p|=mk$. In this case, due to abelian
periodicity, for all $i$, we have $\mathbf{x}[mi,m(i+k)-1]\sim_{ab} p$. Hence we get $|u|\leqslant m$.
Suppose now that $|p|=mk+\ell$, where $0<\ell0$ and $y=y_1\cdots y_n=\mathbf{t}[j,j+n-1]$ satisfies ~\eqref{eq:cond}.
We will show that the length of $w$ is at most $5$.
Recall that $\mathbf{t}[2k,2k+1]\in\{{\tt 01},{\tt 10}\}$ for all $k\ge 0$.
Assume first that $i,j$ are even. Since $\mathbf{t}[i,i+1]$ and
$\mathbf{t}[j,j+1]$ belong to $\{{\tt 01},{\tt 10}\}$, we conclude
that $\mathbf{t}[i,i+1]\sim_{ab}\mathbf{t}[j,j+1]$ and, in that
situation, we can only have an abelian return of length at most $2$.
Assume now that $i$ is odd and $j$ is even and that
$\mathbf{t}_i={\tt 0}$ (symmetric cases can be treated in the
same way). Our aim is to build the longest possible abelian return. Since
$\mathbf{t}_i={\tt 0}$ and $j$ is even, we consider
$\mathbf{t}[j,j+1]={\tt 10}$ because otherwise,
$\mathbf{t}[j,j+1]={\tt 01}$ and $w_1=y_1$ (i.e., $\mathbf{t}[i+1,j+1]\sim_{ab} \mathbf{t}[i,j]\sim_{ab} u$ and we get directly an abelian return of length $n=1$). Now
$\mathbf{t}[i,i+2]={\tt 001}$ because otherwise,
$\mathbf{t}[i,i+2]={\tt 010}$ and $w_1w_2\sim_{ab} y_1y_2$. Continuing
this way, we have $\mathbf{t}[j,j+3]={\tt 1010}$ and
$\mathbf{t}[i,i+4]={\tt 00101}$. Since $({\tt 10})^3$ is not a
factor of $\mathbf{t}$, we have $\mathbf{t}[j,j+5]={\tt 101001}$
and $\mathbf{t}[i,i+4]\sim_{ab} \mathbf{t}[j,j+4]$. In that situation,
we can only have an abelian return of length at most $5$.
The last case is when $i$ and $j$ are odd. Assume
$\mathbf{t}_i={\tt 0}$ and $\mathbf{t}_j={\tt 1}$.
We have $\mathbf{t}_i={\tt 0}$ and $\mathbf{t}_{j-1}={\tt 0}$
because $\mathbf{t}[j-1,j]={\tt 01}$. Moreover,
$z=\mathbf{t}[i+1,j-2]\in\{{\tt 01},{\tt 10}\}^*$ and thus
$v=\mathbf{t}[i,j-1]={\tt 0}z{\tt 0}$ is a word of even length such
that $|v|_{\tt 0}=2+|v|_{\tt 1}$. Therefore $v$ cannot be abelian
equivalent to a prefix $u$ of $\mathbf{t}$.
So in such a situation, we cannot have an abelian return
to some prefix of $\mathbf{t}$.
\end{proof}
\begin{proposition}
If a factor of length $n\ge 6$ of the Thue--Morse word
satisfies \eqref{eq:cond}, then $n$ is even.
\end{proposition}
\begin{proof}
Let $w=\mathbf{t}[i,i+n-1]$ and $y=\mathbf{t}[j,j+n-1]$ be factors
of $\mathbf{t}$ of length $n\ge 6$ satisfying ~\eqref{eq:cond}.
As $n\ge 6$, $i$ and $j$ are odd.
Hence, to satisfy the condition~\eqref{eq:cond}, we must have
$$
\begin{pmatrix}
\mathbf{t}[i,i+n-1]\\
\mathbf{t}[j,j+n-1]\\
\end{pmatrix}\in \begin{pmatrix}
{\tt 0}\\{\tt 1}\\
\end{pmatrix}
\left\{
\begin{pmatrix}
{\tt 01}\\{\tt 01}\\
\end{pmatrix},
\begin{pmatrix}
{\tt 10}\\{\tt 10}\\
\end{pmatrix}, \begin{pmatrix}
{\tt 01}\\{\tt 10}\\
\end{pmatrix}
\right\}^* \begin{pmatrix}
{\tt 1}\\{\tt 0}\\
\end{pmatrix}
\cup
\begin{pmatrix}
{\tt 1}\\{\tt 0}\\
\end{pmatrix}
\left\{
\begin{pmatrix}
{\tt 01}\\{\tt 01}\\
\end{pmatrix},
\begin{pmatrix}
{\tt 10}\\{\tt 10}\\
\end{pmatrix}, \begin{pmatrix}
{\tt 10}\\{\tt 01}\\
\end{pmatrix}
\right\}^* \begin{pmatrix}
{\tt 0}\\{\tt 1}\\
\end{pmatrix}.
$$
So $n$ must be even.
\end{proof}
For $n=6,8,10,\ldots,104$, with a computer search, we get the following number of factors of length $n$ satisfying \eqref{eq:cond}: $6$, $4$, $8$, $12$, $12$, $4$, $8$, $8$, $4$, $0$, $0$, $8$, $0$, $0$, $4$, $8$, $4$, $0$, $0$, $0$, $0$, $0$, $4$, $0$, $4$, $0$, $0$, $0$, $0$, $0$, $4$, $8$, $4$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $0$, $8$, $0$, $0$.
\subsection{Sturmian words}
Sturmian words form one of the most studied classes of infinite words.
It can be defined in several ways. For our purpose, the definition
in terms of rotations is convenient.
Let $C$ be the one-dimensional torus $\mathbb{R}/\mathbb{Z}$
identified with the interval $[0,1)$. As usual, we
denote by $\{x\}$ the fractional part of $x$. The \emph{rotation}
$R_{\alpha}$ defined for a real number $\alpha$ is a mapping
$C\rightarrow C$ that maps an $x$ to $\{x+\alpha\}$.
\begin{theorem}[Kronecker \cite{AS}]\label{theoremKronecker} If a
number $\alpha$ is irrational, then the set of points
$\{R^i_\alpha(\rho) \mid i\in \mathbb{N} \}$ is dense in $C$
for all initial points $\rho\in C$.
\end{theorem}
By an \emph{interval} (resp. \emph{half-interval}) of $C$ we mean a set
of points that is an image of an interval (resp. half-interval) of
$\mathbb{R}$ under operation $\{\cdot\}$. For instance, if $0\le b1-\{k\alpha\}$. In this case, using again the induction hypothesis,
$R^{-k}_\alpha (I_1) \cap I_H(k) = [1-\{(k+1)\alpha\},1)$ is
nonempty. This interval corresponds to the heavy factors of length
$k$ having an extension with ${\tt 1}$ making them the only heavy
factors of length $k+1$ in $\mathbf{x}$.
\end{itemize}
Now we are ready to prove the main part of the statement. First of
all, let us prove that, if $\mathbf{x}$ has a null intercept, then
$\mathcal{APR}_{\mathbf{x}}$ is infinite. Let $k\ge 1$ and $p$ be the
prefix of length $k$ of $\mathbf{x}$. As $\rho=0$, we have $0\in
I_p$. Since the interval $I_p$ corresponds exactly to one word $p$ which
is either light or heavy, we have $I_p \subseteq I_L(k)$ or $I_p \subseteq I_H(k)$.
As $0\in I_p$, we conclude that $I_p \subseteq I_L(k)$ using \eqref{eq:IHIL}.
In other words, we have just shown that each prefix of $\mathbf{x}$ is a light factor.
Now we show that, for all $n$, there exists a length $\ell$ such that
gaps between consecutive occurrences of light factors of length $\ell$
in $\mathbf{x}$ can be larger than $n$. Let $n\ge 1$. Define the set of points
$$S_n := \{R^i_\alpha(0) \mid
0\leqslant i \leqslant n \}$$ and denote by $d$ the minimal length of
intervals having endpoints in $S_n$. Due to Kronecker's theorem, we
can find some $\ell$ such that $|I_L(\ell)|\rho$.
Assume that there is a light factor of length $k$ occurring at
position $i$ in $\mathbf{x}$, i.e., $R^{i}_\alpha (\rho)\in
I_L(k)$. We consider two cases. If $R^{i}_\alpha (\rho)\ge D(c)$, there exists $j \in\{1,\ldots,
c\}$ such that $R^{j}_\alpha(0)=s_c$ and $\theta=1-R^{j}_\alpha(0)\in(0,D(c)]$. Hence the
point $R^{j}_\alpha(R^{i}_\alpha (\rho))=R^{i+j}_\alpha (\rho)=R^{i}_\alpha (\rho)-\theta$ belongs to
$I_L(k)$, i.e., the factor of length $k$ at position $i+j$ in
$\mathbf{x}$ is light again. If $R^{i}_\alpha (\rho)< D(c)$, there exists $j \in\{1,\ldots,
c\}$ such that $R^{j}_\alpha(0)=s_1\le D(c)<\rho/2$. Hence the
point $R^{j}_\alpha(R^{i}_\alpha (\rho))=R^{i+j}_\alpha (\rho)=R^{i}_\alpha (\rho)+s_1$ is less than $\rho$ and belongs to
$I_L(k)$.
A similar proof can be done for the case of a heavy prefix of length
$k$. Assume that $\rho\in I_H(k)$ and that, for some $i\ge 0$,
$R^{i}_\alpha (\rho)\in I_H(k)$. If $R_\alpha^i(\rho)<1-D(c)$, then
$R_\alpha^i(\rho)+s_1<1$. If $R_\alpha^i(\rho)\ge 1-D(c)$, then
$R_\alpha^i(\rho)-(1-s_c) \ge 1-2D(c)>\rho$. We can derive the same conclusion as above.
Hence the number $c$ is an upper bound
on the length of abelian returns to any prefix and therefore
$\mathcal{APR}_{\mathbf{x}}$ is finite.
\end{proof}
\begin{lemma}\label{lem:sturmper}
No Sturmian word is abelian periodic.
\end{lemma}
\begin{proof}
Proceed by contradiction and assume that
$\mathbf{x}=St(\alpha,\rho)$ is abelian periodic of period $m$
with $\alpha$ irrational. Then all factors of the kind
$\mathbf{x}[tm,(t+1)m-1]$, $t\in\mathbb{N}$, are abelian
equivalent, i.e., have the same weight. Assume that, for
all $t$, $R_\alpha^{tm}(\rho)= R_{m\alpha}^{t}(\rho)\in I_L(m)$.
But since $\alpha$ is irrational, $m\alpha$ is also irrational and
thanks to Kronecker's theorem, $\{R_{m\alpha}^{t}(\rho)\mid t\ge
0\}$ is dense in $C$ contradicting the fact that
$\{R_{m\alpha}^{t}(\rho)\mid t\ge 0\}\cap I_H(m)$ should be empty.
\end{proof}
\subsection{Finiteness of $\mathcal{APR}_{\mathbf{f}}$ for the Fibonacci word}
From Theorem~\ref{thmSturmianImplF}, since the Fibonacci word
$\mathbf{f}$ is given by $St(1/\tau^2,1/\tau^2)$, we already know that
$\mathcal{APR}_{\mathbf{f}}$ is finite. Here we exhibit exactly the
elements of this set in Theorem~\ref{the:fib}. As a first attempt,
\eqref{eq:cond} gives a necessary condition that allows one to exclude
some words as abelian returns. This condition will not be used in the
proof of Theorem~\ref{the:fib} but, interestingly, our developments can
be related to the lengths of the palindromic prefixes of $\mathbf{f}$,
\cite{luca,Fis}.
\begin{proposition}\label{propfibofinite}
For the Fibonacci word, there exist exactly two factors of length $n$
satisfying \eqref{eq:cond} if $n$ is a Fibonacci number. Otherwise, no
factor of length $n$ satisfies such a condition.
\end{proposition}
\begin{proof}
Consider two factors $x,y$ of length $n$ satisfying
\eqref{eq:cond} and occurring in
$$\mathbf{f}={\tt 010010100100}\cdots .$$
Assume that $x$ starts with ${\tt 0}$.
Then to fulfill \eqref{eq:cond}, $y$ starts with ${\tt 1}$. Since
$\mathbf{f}$ is Sturmian, for any two words of
the same length $x'$ and $y'$ which are prefixes of $x$ and $y$ respectively, we have $\left|
|x'|_{\tt 1}-|y'|_{\tt 1} \right|\le 1$. Therefore, we deduce
that $x$ and $y$ must be of the form $x={\tt 0}u{\tt 1}$ and
$y={\tt 1}u{\tt 0}$ for some $u\in\{{\tt 0},{\tt 1}\}^*$. This
means that $u$ is a bispecial factor of the Fibonacci word.
Recall that the left special factors in $\mathbf{f}$ are its
prefixes and its right special factors are the mirror images of
its prefixes \cite[Prop. 4.10.3]{CANT}. So bispecial factors of
$\mathbf{f}$ are its palindromic prefixes. If $(\ell_i)_{i\ge 1}$
denotes the increasing sequence of all lengths of palindromic
prefixes in $\mathbf{f}$, it is well-known that $(\ell_i)_{i\ge
1}=(0,1,3,6,11,\ldots)$ is given by $\ell_i=F_{i+1}-2$ where
$F_i$ is the $i$th Fibonacci number. See \cite[Thm. 5]{luca} and \cite{Fis}.
Hence $n$ must be a Fibonacci number.
Conversely, for any bispecial factor $u$ of $\mathbf{f}$, it is easy to
show that either ${\tt 0}u{\tt 0}$ or ${\tt 1}u{\tt 1}$ is not a factor
occurring in $\mathbf{f}$ (see for instance \cite[p. 47]{Lot}).
Therefore, amongst the four words ${\tt 0}u{\tt 0}$, ${\tt 1}u{\tt
1}$, ${\tt 0}u{\tt 1}$ and ${\tt 1}u{\tt 0}$, the last two must
occur in $\mathbf{f}$ and we get exactly two factors of length $|u|+2$
satisfying \eqref{eq:cond}. Indeed, assume that ${\tt 0}u{\tt 0}$ does
not occur in $\mathbf{f}$. Then for $u$ to be left (resp. right)
special, ${\tt 0}u{\tt 1}$ (resp. ${\tt 1}u{\tt 0}$) must occur in
$\mathbf{f}$.
\end{proof}
The reader may notice that the computations carried out in the proofs of the next two results could also be adapted to other Sturmian words.
\begin{theorem}\label{the:fib}
The set $\mathcal{APR}_{\mathbf{f}}$ of abelian returns to prefixes for the Fibonacci word $\mathbf{f}$ contains exactly the words ${\tt 0},\ {\tt 1},\ {\tt 01},\ {\tt 10},\ {\tt 001}$.
\end{theorem}
\begin{proof}
Using the same notation as in Theorem~\ref{thmSturmianImplF}, for
$c=7$, we have $D(7)\approx 0.145898$ which is such that
$2D(7)<\min\{1/\tau^2, 1-1/\tau^2\}\approx 0.381$. Hence, all abelian
returns to prefixes of the Fibonacci word have length at most $7$.
Actually, this value can be reduced:
\begin{lemma}\label{lemFibRedToThree}
There is no abelian return of length greater than $3$ to prefixes in the Fibonacci word.
\end{lemma}
\begin{proof}
With the notation of the proof of Theorem~\ref{thmSturmianImplF},
we set $\rho=\alpha = 1/\tau^2\approx 0.381$. Let $i$ be a natural number. Define the four points
$\rho_{i,t} = R^{i+t}_\alpha(\rho)$ for $t=0,1,2,3$.
Recall that, for all $k\ge 1$, the unit circle $[0,1)$ is split
into two half-intervals $I_H(k) = [1-\{k\alpha\},1)$ and
$I_L(k)=[0,1-\{k\alpha\})$ such that two factors $\mathbf{f}[i,i+k-1]$
and $\mathbf{f}[j,j+k-1]$ are abelian equivalent if and only if the
points $R^i_\alpha(\rho)$ and $R^j_\alpha(\rho)$ belong to the
same interval $I_H(k)$ or $I_L(k)$.
Let $I$ be any of the two intervals $I_H(k)$ or
$I_L(k)$. What we are going to prove is that if $\rho$ and
$\rho_{i,0}$ belong to $I$, then either $\rho_{i,2}$ or
$\rho_{i,3}$ belongs to $I$. In other words, if
$\mathbf{f}[i,i+k-1]\sim_{ab}\mathbf{f}[0,k-1]$, then either
$\mathbf{f}[i+2,i+k+1]$ or $\mathbf{f}[i+3,i+k+2]$ is abelian
equivalent to $\mathbf{f}[i,i+k-1]$ which gives the upper bound on
the length of any abelian return to a prefix of $\mathbf{f}$.
Note, that $\rho_{i,0}=R_\delta(\rho_{i,2})$, $\rho_{i,2}=R_{-\delta}(\rho_{i,0})$ and
$\rho_{i,3}=R_{\alpha-\delta}(\rho_{i,0})$, where
$\delta=1-2\alpha\approx 0.2361$. Assume that the factor of length $k$ starting in position $i$ is abelian equivalent to the prefix of $\mathbf{f}$ of length $k$, i.e., $\rho$ and $\rho_{i,0}$ are both light or heavy words. We consider two cases.
Suppose first that $\rho,\rho_{i,0}\in I_L(k)$. In this case, we have $[0,\rho] \subseteq I_L(k)$.
\begin{itemize}
\item If $\rho\le \rho_{i,0}<1$, then $[0,\rho_{i,0}] \subseteq I_L(k)$ and $0<\rho_{i,2}=R_{-\delta}(\rho_{i,0})<\rho_{i,0}$. Thus $\rho_{i,2}$ belongs also to $I_L(k)$.
\item If $\rho>\rho_{i,0}>0$, either $\rho_{i,0}\ge \delta$ and
then $\rho_{i,2}=R_{-\delta}(\rho_{i,0})\in[0,\rho)$ meaning
that $\rho_{i,2}\in I_L(k)$ or, $0<\rho_{i,0}< \delta$, i.e.,
$-\delta<\rho_{i,2}-1<0$ and then
$0<\alpha-\delta<\rho_{i,3}=R_{\alpha}(\rho_{i,2})<\rho$ meaning that
$\rho_{i,3}\in I_L(k)$.
\end{itemize}
Suppose now that $\rho,\rho_{i,0}\in I_H(k)$. In this case, as $\rho\in I_H(k)$, we have $[\rho,1)\subseteq I_H(k)$.
\begin{itemize}
\item If $\rho>\rho_{i,0}>0$, then $[\rho_{i,0},1)\subseteq I_H(k)$ and $\rho_{i,3}=R_{\alpha-\delta}(\rho_{i,0})$ belongs to $I_H(k)$.
\item If $\rho\le \rho_{i,0}<1$, either $\rho_{i,0}\ge \rho+\delta$ and
then $\rho_{i,2}=R_{-\delta}(\rho_{i,0})\in I_H(k)$ or, $\rho\le \rho_{i,0}< \rho+\delta$, i.e.,
$\rho-\delta\le\rho_{i,2}<\rho$ and then
$\rho<\rho-\delta+\alpha\le \rho_{i,3}=R_{\alpha}(\rho_{i,2})<\rho+\alpha<1$ meaning that
$\rho_{i,3}\in I_H(k)$.
\end{itemize}
\end{proof}
The factors of length at most $3$ occurring in $\mathbf{f}$ are
$\varepsilon$, ${\tt 0}$, ${\tt 1}$, ${\tt 00}$, ${\tt 01}$, ${\tt
10}$, ${\tt 001}$, ${\tt 010}$, ${\tt 100}$ and ${\tt 101}$.
Clearly, ${\tt 00}$, ${\tt 010}$ and ${\tt 101}$ do not satisfy
\eqref{eq:cond} and cannot be abelian returns. To conclude the proof,
we just have to show that ${\tt 100}$ is also forbidden.
\begin{lemma}
The set $\mathcal{APR}_{\mathbf{f}}$ of abelian returns to
prefixes for the Fibonacci word $\mathbf{f}$ does not contain
${\tt 100}$.
\end{lemma}
\begin{proof}
We continue with notation of Lemma~\ref{lemFibRedToThree}.
Suppose that ${\tt 100}\in \mathcal{APR}_{\mathbf{f}}$. There exists a prefix $p$ of $\mathbf{f}$ of length $k$ and a
position $i\ge 0$ such that
\begin{enumerate}
\item $\mathbf{f}[i,i+2]={\tt 100}$,
\item $\mathbf{f}[i,i+k-1]\sim_{ab}p$, i.e., $\rho$ and
$\rho_{i,0}$ belong to the same interval $I\in\{I_L(k),I_H(k)\}$,
\item for $t=1,2$, $\mathbf{f}[i+t,i+t+k-1]\not\sim_{ab}p$, i.e., $\rho_{i,1}$ and
$\rho_{i,2}$ do not belong to $I$,
\item $\mathbf{f}[i+3,i+2+k]\sim_{ab}p$.
\end{enumerate}
To get a contradiction, let us prove that either $\rho_{i,1}$ or
$\rho_{i,2}$ belongs to $I$. Since $\mathbf{f}_i={\tt 1}$, $\rho_{i,0}$ belongs to $I_1=[1-\alpha,1)$.
If $I=I_L(k)$, then we have $\rho_{i,1}\in [0,\rho)\subseteq I_L(k)$. If $I=I_H(k)$, then
we have $\rho_{i,2}\in [\rho,\rho+\alpha)\subseteq I_H(k)$.
\end{proof}
That concludes the proof of Theorem~\ref{the:fib}.
\end{proof}
\subsection{Link with abelian complexity}
The {\em abelian complexity} of an infinite word $\mathbf{x}$ is the
function $p^{ab}_\mathbf{x}:\mathbb{N}\to\mathbb{N}$ that maps $n\ge
0$ to the number of distinct abelian equivalence classes of
factors of length $n$ in $\mathbf{x}$. Let $C>0$. Recall that an
infinite word $\mathbf{x}\in A^\omega$ is {\em $C$-balanced}, if for
all $u,v\in\Fac(\mathbf{x})$ such that $|u|=|v|$, we have $\left|
|u|_a-|v|_a\right|\le C$ for all $a\in A$.
\begin{lemma}\cite{Rich}
An infinite word has bounded abelian complexity if and only if it is $C$-balanced for some $C>0$.
\end{lemma}
\begin{proposition} If $\mathbf{x}$ is an abelian recurrent word such that
$\mathcal{APR}_{\mathbf{x}}$ is finite, then $\mathbf{x}$ has bounded abelian complexity.
\end{proposition}
\begin{proof} Suppose $\mathbf{x}$ satisfies the assumptions of the
proposition but that $\mathbf{x}$ has unbounded abelian
complexity. From the previous lemma, we deduce that there exists a
symbol $a$ such that the maximum of differences $|u|_a-|v|_a$ for
factors $u,v$ in $\mathbf{x}$ having equal length can be
arbitrarily large.
Let $\delta>0$. There exist $u,v\in \Fac(\mathbf{x})$ of
equal length $n$ such that $|u|_a-|v|_a\ge\delta$. Let
$p=x_0x_1\ldots x_{n-1}$ be the prefix of length $n$ of
$\mathbf{x}$. Without loss of generality, we may assume that
$$||u|_a-|p|_a|\geqslant\frac{\delta}{2}.$$ Indeed, if $||u|_a-|p|_a|<\delta/2$ and $||v|_a-|p|_a|<\delta/2$, then one would deduce that $||u|_a-|v|_a|<\delta$.
As $\mathbf{x}$ is abelian recurrent, factors abelian equivalent to $p$ (resp. to $u$) occur infinitely often in $\mathbf{x}$. Therefore there exist $i