%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% LaTeX for a 7 page article. %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentstyle{article}
\setlength{\topmargin}{.25in}
\setlength{\textheight}{7.5in}
\setlength{\oddsidemargin}{.375in}
\setlength{\evensidemargin}{.375in}
\setlength{\textwidth}{5.75in}
%The following line will define a blackboard bold N if
%you don't have one.
\def\Bbb#1{\rm {I\kern -2pt #1}}
\pagestyle{myheadings}
\markright{\sc the electronic journal of combinatorics 2 (1995), \#R21 \hfill}
\thispagestyle{empty}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{rem}[theorem]{Remark}
\newenvironment{remark}{\begin{rem}\rm}{\end{rem}}
\newtheorem{claim}[theorem]{Claim}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{eg}[theorem]{Example}
\newenvironment{example}{\begin{eg}\rm}{\end{eg}}
\newtheorem{conj}[theorem]{Conjecture}
\newenvironment{conjecture}{\begin{conj}\rm}{\end{conj}}
\newtheorem{prob}[theorem]{Problem}
\newenvironment{problem}{\begin{prob}\rm}{\end{prob}}
\newcommand{\pf}{\noindent {\bf Proof: }}
\begin{document}
\title{A note on antichains of words}
\author{James D. Currie\thanks{This work was supported by an NSERC
operating grant.}\\
Department of Mathematics and Statistics\\University of
Winnipeg\\Winnipeg, Manitoba\\Canada R3B 2E9\\currie@uwinnipeg.ca}
\date{Submitted: July 9, 1995; Accepted: October 14, 1995}
\maketitle
\begin{abstract}
We can compress the word `banana' as $xyyz$, where
$x =$ `b', $y =$ `an', $z =$ `a'. We say that `banana' {\it encounters\/}
$yy$. Thus a `coded' version of $yy$ shows up in `banana'. The
relation `$u$ encounters $w$' is transitive, and thus generates
an order on words. We study antichains under this order.
In particular we show that in this order
there is
an infinite antichain of binary
words
avoiding overlaps.
\end{abstract}
\medskip
{\bf AMS Subject Classification:} 68R15, 06A99\\
{\bf Key Words:} overlaps, antichains, words avoiding patterns\\
\section{Introduction}
The study of words avoiding patterns is an area of combinatorics
on words reaching back at least to the turn
of the twentieth century, when Thue proved \cite{thue} that one can find arbitrarily
long words over a 3 letter alphabet in which no two adjacent subwords
are identical. If $w$ is such a word, then $w$ cannot be written
$w = xyyz$ with $y$ a non-empty
word. In modern parlance, we would say that $w$ {\bf avoids} $yy$. A word
which can be written as $xyyz$ is said to {\bf encounter} $yy$.
Thue also showed that there are arbitrarily long words over a 2 letter
alphabet avoiding $yyy$. One can quickly check that no word of length 4 or
more over a 2 letter alphabet avoids $yy$. We say that $yyy$ is {\bf
avoidable} on 2 letters, or {\bf 2-avoidable},
whereas $yy$ is {\bf unavoidable} on 2 letters.
On the other hand, $yyy$ is certainly 3-avoidable, because any
word avoiding $yy$ must avoid $yyy$ also.
Bean, Ehrenfeucht and McNulty \cite{bean}, and
independently Zimin \cite{zimin}, characterized words which are avoidable
on some large enough finite alphabet. If $p$ is a word over an $n$ letter
alphabet, then $p$ is avoidable on some finite alphabet exactly when $Z_n$ avoids
$p$, where words $Z_n$ are defined recursively by
\begin{eqnarray*}
Z_1&=&1\\
Z_{n+1}&=&Z_n(n+1)Z_n, \quad n\in {\Bbb N}.
\end{eqnarray*}
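For illustration, unrolling the recursion for the first few values of $n$ gives
\begin{eqnarray*}
Z_2&=&121\\
Z_3&=&1213121\\
Z_4&=&121312141213121,
\end{eqnarray*}
and in general $Z_n$ has length $2^n-1$, since $Z_{n+1}$ consists of two
copies of $Z_n$ and one new letter.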
Thus the pattern $abcacb$ is a pattern over a 3 letter alphabet, and is
avoided by $Z_3 = 1213121$. It follows that $abcacb$ is avoidable on some
large enough finite alphabet. The size of the smallest alphabet on
which $abcacb$ is avoidable isn't known. No avoidable pattern
is known which is not 4-avoidable. The following conjecture is
given by Baker \cite{LITP}:
\begin{conjecture}
Every avoidable pattern is 4-avoidable.
\end{conjecture}
The following problem has been open since 1979 \cite{bean}:
\begin{problem}
Find an algorithm which given a word $p$ determines the smallest
$k$ such that $p$ is $k$-avoidable.
\end{problem}
Cassaigne and Roth \cite{cassaigne,roth}
studied
avoidable binary and ternary patterns $p$, giving
when possible the smallest $k$ for which $p$ is $k$-avoidable.
In such
work, the most important patterns are the minimal ones; as discussed above,
$yyy$ is 3-avoidable because it encounters $yy$. Similarly, if $w$
is 3-avoidable, then so is $w^R$, the reverse of $w$.
It follows from
the work of Cassaigne and Roth
that a binary pattern is 2-avoidable exactly when it encounters one of
$xxx$, $xyxyx$, $xyxxy$, $xxyxyy$, $xyxyyx$ and $xxyyx$.
Consideration of minimal $k$-avoidable patterns leads to
problems such as the following,
posed in \cite{baker}:
\begin{problem}
Write $u\ge w$ if $u$ encounters $w$ or the reversal of $w$. This relation
is a quasi-order, and factoring out the resulting equivalence relation gives
a partial order. Let $\mu(w)$ be the size
of the smallest alphabet on which $w$ is avoidable. For avoidable
$w$, is there an infinite antichain on $\mu(w)$ letters such that each
member of the antichain avoids $w$?
\end{problem}
Perhaps the posing of this problem is too ambitious. An
affirmative solution would imply that for any
avoidable word $w$, it takes no more letters to avoid
$w$ and $w^R$ simultaneously than it does to avoid $w$ alone. Strong
evidence to the contrary is provided by the example $w=abcacb$. It
seems likely that $abcacb$ is 2-avoidable; there are words of length 1000
over $\{0,1\}$
avoiding $abcacb$. However,
$abcacb$ and $bcacba$ are not simultaneously 2-avoidable.\footnote{Thanks
to Kirby Baker for the use of his software which allowed
the author to make these discoveries concerning $abcacb$.}
For this reason, it seems that a better question to ask is the following:
\begin{problem}
Write $u\ge w$ if $u$ encounters $w$.
For avoidable
$w$, is there an infinite antichain on $\mu(w)$ letters such that each
member of the antichain avoids $w$?
\end{problem}
This question was answered in the affirmative in \cite{goralcik} in the
case where $w$ is 2-avoidable.
Note that for the sake of studying antichains it is unnecessary to move
from the quasi-order to a partial order.
In a related paper \cite{crochemore} it was shown that
for any $\epsilon > 0$
there is an infinite antichain of ternary words
avoiding $x^k$ for $7/4 < k <7/4+\epsilon.$ Note that
$7/4$ is the {\bf threshold} of repetition for words over a 3 letter
alphabet; if $r < 7/4$,
we can find at most finitely many words over a 3 letter alphabet
avoiding $x^r$.
The threshold of repetition for a 2 letter alphabet is 2. A word
containing no subwords of the form $x^k$ for $k>2$ is called
{\bf overlap-free}. A word is overlap-free exactly when it avoids
both the
patterns $xxx$ and $xyxyx$.
In this note we prove that
any binary word
which avoids overlaps is an element of an infinite antichain of binary
words
avoiding overlaps.
\section{Preliminaries}
An {\bf alphabet} $\Sigma$ is a set whose elements
are called {\bf letters}. A {\bf word} $w$
over $\Sigma$
is a finite string of letters from $\Sigma.$
The {\bf length} of word $w$ is the
number of letters
in $w$, denoted by $|w|.$ Thus $|banana|=6$, for example.
The language consisting of
all words over $\Sigma$ is denoted by $\Sigma^*.$
If $x,y \in \Sigma^*,$ the
{\bf concatenation} of $x$ and $y$, written $xy$, is simply the
string consisting
of $x$ followed by $y.$
The word with no letters is called the {\bf empty word} and is denoted by
$\epsilon.$
Suppose $w \in \Sigma^*.$ We call word $x$ a {\bf prefix} of $w$ if $w = xy,$ some
$y \in \Sigma^*.$
Similarly word $y$ is a {\bf suffix} of $w$ if we can write
$w = xy,$ some $x \in \Sigma^*.$
We call $y$ a {\bf subword} of $w$ if we can write $w = xyz,$ some $x,z \in
\Sigma^*.$ In the case that $x$ and $z$ are non-empty, $y$ is an
{\bf internal subword} of $w$.
Let $\Sigma$, $T$ be alphabets.
A {\bf substitution} $h: \Sigma^{*}\rightarrow T^{*}$
is a function generated
by its values on $\Sigma.$ That is, suppose $w\in \Sigma^{*}, \,
w = a_1 a_2\ldots a_m; \, a_i \in \Sigma$ for $i = 1$ to $m.$ Then
$h(w) = h(a_1)h(a_2)\ldots h(a_m).$
A substitution is {\bf non-erasing} if for every
$a\in \Sigma$, $h(a)\ne \epsilon$.
Let $w,v$ be finite words over some alphabet
$\Sigma$. We say that $w$ {\bf encounters} $v$ if $w=xh(v)y$ for some
non-erasing substitution $h:\Sigma^*\rightarrow\Sigma^*.$ Otherwise we say
that $w$ {\bf avoids} $v.$
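For illustration, take $w = banana$ and $v = yy$, and assume for the
purposes of this example only that $\Sigma$ contains the letter $y$ as well
as $a$, $b$ and $n$. Let $h$ be the substitution with $h(y) = an$ which
fixes every other letter. Then $h$ is non-erasing and
$$banana = b\,h(yy)\,a, \qquad h(yy) = anan,$$
so that $banana$ encounters $yy$. The choice $h(y) = na$ works equally
well, since $banana = ba\,h(yy)$. By contrast, the word $ban$ avoids $yy$,
because no two adjacent subwords of $ban$ are identical.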
If $x$ is a word we denote by $x^n$ the word consisting of $x$ repeated
$n$ times in a row. Thus $x^2=xx, x^3=xxx$ and so on. We call a word
$w$ a {\bf $k$-power} if $w = x^k,$ some $x \ne \epsilon.$ A 2-power is
also called a {\bf square}.
An {\bf overlap} is a word of form $xxx$ or $xyxyx$ for some non-empty
words $x$ and $y$.
A word $w$ is {\bf $k$-power free} if
we cannot write $w = xyz$, where
$y$ is a $k$-power.
Thus $w$ is $k$-power free if $w$ avoids $x^k$.
Similarly one speaks of {\bf square-free} or {\bf overlap-free} words.
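For illustration, $aabaab = (aab)^2$ is a square, and $ababa$ is an overlap
of the form $xyxyx$ with $x = a$ and $y = b$. The word $abba$ is
overlap-free but not square-free, since it contains the square $bb$. The
word $banana$ is neither square-free nor overlap-free, since it contains
the overlap $anana$, which has the form $xyxyx$ with $x = a$ and $y = n$.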
An {\bf $\omega$-word} over alphabet $\Sigma$ is
an infinite sequence of letters of $\Sigma.$ If
$w = \{w_i\}_{i \in \mbox{{\scriptsize {\Bbb N}}}}$
is an $\omega$-word over $\Sigma,$ then
each finite initial segment $w_1, w_2,\ldots,w_n$ of $w$ will
correspond to some word $w_1w_2\ldots w_n$ of $\Sigma^*.$
In this case we say that $w_1w_2\ldots w_n$ is a {\bf prefix}
of the $\omega$-word $w$.
If $u$ is an $\omega$-word over $\Sigma$ we
say that $u$ encounters $w$ if some finite prefix of $u$ encounters $w.$
Otherwise, we say that $u$ avoids $w.$
We say that $w$ is {\bf $k$-avoidable} if the set of words over
$\Sigma$
avoiding $w$ is infinite, for some, hence for any,
alphabet $\Sigma$ of size
$k$. Equivalently, $w$ is $k$-avoidable if there is an $\omega$-word
over an alphabet of size $k$ which avoids $w$.
If $w$ is $k$-avoidable for some $k\in
{\Bbb N}$ we say that $w$ is {\bf avoidable}. Otherwise, $w$ is
{\bf unavoidable}. Let $S$ be a set of words.
We say that $v$ avoids
$S$ if $v$ avoids each $w\in S.$
Fix an alphabet $\Sigma$. The relation `$w$ encounters $v$' is a
quasi-order on $\Sigma^*$ which
we will abbreviate by $w\ge v.$
We will be interested in the quasi-ordered set $\langle\Sigma^*,\ge\rangle.$
\begin{lemma}
\label{infinite antichain}
Suppose that ${\cal A}\subseteq\Sigma^*$ is an infinite antichain.
Then there
is an $\omega$-word over $\Sigma$ avoiding ${\cal A}.$
\end{lemma}
\pf Let ${\cal A}=\{w_i\}_{i=1}^\infty.$
If $w$ is a non-empty word, denote by $w'$ the word obtained from $w$
by deleting the last letter.
We claim that for
each $i\in {\Bbb N}$, $w_i'$ avoids ${\cal A}$:
If $v\le w_i'$ then $v\le w_i$ by transitivity. Thus if
$j\ne i,$ then $w_i'$ avoids $w_j,$ because $w_j$ and $w_i$ are
incomparable.
On the other hand, since $w_i'$ is shorter than
$w_i,$ certainly $w_i'$ avoids $w_i.$
Since ${\cal A}$ is an infinite set of words over a finite alphabet,
${\cal A}$ contains arbitrarily long words. Thus the set
${\cal A}'=\{w_i'\}_{i=1}^\infty$ contains arbitrarily long words of
$\Sigma^*$ avoiding ${\cal A}.$ It follows by K\"{o}nig's Infinity
Lemma that there is an $\omega$-word over $\Sigma$ avoiding
${\cal A}.\Box$
\begin{lemma}
Let $S$ be a finite set of avoidable words.
Then
there
is an $\omega$-word over a finite alphabet avoiding $S.$
\end{lemma}
\pf This is proved in \cite{bean,zimin}. Let $S = \{s_i:1\le i\le m\}.$
For each $i$ pick $n_i\in {\Bbb N}$ and an $\omega$-word
$w_i=\{w_{ij}\}_{j=1}^\infty$ over
an alphabet $\Sigma_i$ of size $n_i$ avoiding $s_i.$
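Then the word
$w = \{(w_{1j},w_{2j},\ldots,w_{mj})\}_{j=1}^\infty$ over
the alphabet $\prod_{i=1}^m \Sigma_i$
avoids $S$: if $w$ encountered some $s_i$, then projecting onto the
$i$th coordinate would show that $w_i$ encounters $s_i.\Box$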
To avoid $S$ it suffices to avoid a maximal antichain of minimal
elements of $S.$ Thus we could get by with a version of this lemma in
which $S$ was restricted to be an antichain.
\begin{corollary}
Suppose that ${\cal A}\subseteq\Sigma^*$ is a finite antichain of avoidable words.
Then there
is an $\omega$-word over some finite alphabet avoiding ${\cal A}.$
\end{corollary}
\begin{remark}
It is striking that infinite antichains over finite alphabets are easier to
avoid than finite antichains! That is, to avoid a finite antichain over
$\Sigma$ it may be necessary to move to a larger alphabet, whereas this
is not the case with infinite antichains. To give a concrete example,
let $S$ be the set of all words of length 7 over $\Sigma = \{a, b\}.$
Each word of $S$ is 2-avoidable, but any binary word of length 7
or more encounters an element of $S$.
\end{remark}
An image of $xxx$ or $xyxyx$ under a non-erasing substitution is called an
{\bf overlap}. Note that every overlap has a prefix which is a square.
\begin{theorem}
There is
an infinite antichain of binary
words
avoiding overlaps.
\end{theorem}
\pf
Following Thue \cite{thue},
define the map $h:\{a,b\}^*\rightarrow
\{a,b\}^*$ by $h(a) = ab, h(b) = ba.$
Let $l = aabaab$. Thus $l^R=baabaa.$ We see that the prefixes of
$l$ which are suffixes of $l^R$ are exactly $a$, $aa$, $aabaa.$
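For illustration, iterating $h$ on the letter $a$ gives
\begin{eqnarray*}
h(a)&=&ab\\
h^2(a)&=&abba\\
h^3(a)&=&abbabaab\\
h^4(a)&=&abbabaabbaababba.
\end{eqnarray*}
Each $h^n(a)$ is a prefix of $h^{n+1}(a)$, so these words are the prefixes
of a single $\omega$-word. The notation $h^\omega(a)$ used below is
presumably meant, following the usual convention, to denote this
$\omega$-word.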
\begin{lemma}
\label{no l in h}
Word $l$ is avoided by
$h^\omega(a).$
\end{lemma}
\pf This was proved by Cassaigne
\cite[Section 2.6, Th\'{e}or\`{e}me 2.2]{cassaigne}.
$\Box$
\begin{corollary}
The word $l^R$, the reverse of $l$, is avoided by
$h^\omega(a).$
\end{corollary}
Let $n\in{\Bbb N}$. Then we can write
$h^{2n+2}(a)=abbabaabu_nbaababba$ for some word $u_n.$
Let
$m_n = aabaabu_nbaabaa.$
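For illustration, assume that ${\Bbb N}$ starts at 1 and take $n = 1$.
Applying $h$ four times to $a$ gives
$$h^4(a) = abbabaab\,baababba,$$
so that $u_1 = \epsilon$ and
$$m_1 = aabaab\,baabaa = aabaabbaabaa,$$
a word of length 12 which begins with $l$ and ends with $l^R$.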
\begin{remark}
\label{internal}
Word $l$ is a prefix, and $l^R$ a suffix, of each
$m_n$, but every internal subword of $m_n$ is a subword of
$h^\omega(a).$ It follows that $l$ doesn't appear in $m_n$ internally.
We note that for each $n$, $m_n$ is a palindrome.
\end{remark}
\begin{lemma} Let $n\in{\Bbb N}$. Then the word
$m_n$ is overlap-free.
\end{lemma}
\pf Every internal
subword of $m_n$ is a subword of $h^\omega(a)$, and is overlap-free.
It remains to show that no prefix or suffix of $m_n$ is an overlap.
As $m_n$ is a palindrome, we need only show that no prefix of $m_n$
is an overlap.
First note that no prefix of $m_n$ is a square of length $12$ or
greater; otherwise the prefix of $m_n$ of length $6$ reappears
internally, i.e. $m_n$ contains an $l$ internally, which is
impossible. It follows that the shortest overlap which is a prefix of
$m_n$ has length at most $11,$ and is a prefix of $m_1.$ Inspection
shows that no prefix of $m_1$ of length $11$ or less is an overlap.$\Box$
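The inspection can be sketched as follows, assuming that ${\Bbb N}$ starts
at 1, so that $h^4(a) = abbabaabbaababba$ gives $u_1 = \epsilon$ and
$m_1 = aabaabbaabaa$. If some prefix of $m_1$ were an overlap, then some
prefix of $m_1$ would have the form $uuc$, where $u$ is non-empty and $c$ is the first letter
of $u$; since no prefix is a square of length 12 or greater, we would have
$|u|\le 5$. Here $u$ must be the prefix of $m_1$ of length $|u|$, so there
are only five candidates to compare:
\begin{center}
\begin{tabular}{ccc}
$|u|$ & $uuc$ & prefix of $m_1$ of length $2|u|+1$\\
1 & $aaa$ & $aab$\\
2 & $aaaaa$ & $aabaa$\\
3 & $aabaaba$ & $aabaabb$\\
4 & $aabaaabaa$ & $aabaabbaa$\\
5 & $aabaaaabaaa$ & $aabaabbaaba$
\end{tabular}
\end{center}
In each row the two words differ, so no prefix of $m_1$ of length 11 or
less is an overlap.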
\begin{theorem}The set $\{m_n\}_{n\in{\Bbb N}}$ is an antichain.
\end{theorem}
\pf Let $i,j\in{\Bbb N}, i