Abstract
We study the language-theoretic properties of the word problem, in the sense of Duncan & Gilman, of weakly compressible monoids, as defined by Adian & Oganesian. We show that if C is a reversal-closed super-, as defined by Greibach, then M has word problem in C if and only if its compressed left monoid L(M) has word problem in C. As a special case, we may take C to be the class of context-free or indexed languages. As a corollary, we find many new classes of monoids with decidable rational subset membership problem. Finally, we show that it is decidable whether a one-relation monoid containing a non-trivial idempotent has context-free word problem. This answers a generalization of a question first asked by Zhang in 1992.
2020 Mathematics Subject Classification:
1 Introduction
The word problem for finitely presented monoids was introduced in 1914 by Thue [Citation39]. Though this problem would eventually turn out to be undecidable in general, via the remarkable work by Markov [Citation31] and Post [Citation38], there are many tractable cases. Particularly the word problem for one-relation monoids has garnered a great deal of attention. Though it remains a tantalizing open problem whether this problem is decidable, much can be said (see especially the recent survey [Citation35]). Thue himself studied this problem, and solved the problem for some cases when the single defining relation is of the form w = 1. Such monoids are known as special one-relation monoids. Adian [Citation1, Citation2] studied special monoids in-depth, and proved that the word problem is decidable for all special one-relation monoids, via a reduction to the word problem for one-relator groups; this latter problem is decidable by Magnus’ famous theorem [Citation26]. Adian’s student Makanin [Citation28, Citation29, Citation33] later extended this to show that the word problem for any special k-relation monoid reduces to the word problem for some k-relator group. The author has recently proved similar language-theoretic reductions for special monoids [Citation36].
The work by Adian on special monoids would later be extended by Adian and his student Oganesian in the joint work [Citation3]. They defined a form of “compression,” which is called weak compression by the author in [Citation35]. A monoid M with defining relations ui = vi (for ) is called weakly compressible if there is some non-empty word α such that every ui and vi begins and ends with α. Associated to any weakly compressible monoid is a “compressed” left monoid L(M), in which the sum of the lengths of the defining relations are shorter than in the original monoid. The main idea is that the word problem for M can be reduced to the word problem for L(M). In particular, by combining this with Adian’s result, this solves the word problem in the class of subspecial one-relation monoids, being the class of monoids which can be compressed to a special one-relation monoid. Subspecial monoids were studied already by Lallement [Citation24], and are (essentially) the one-relation monoids containing a non-trivial idempotent.
Weak compression of subspecial monoids has received some further attention in the past few decades. Zhang [Citation41] extended the above results on the word problem to solve the conjugacy problem in certain subspecial monoids. Kobayashi [Citation22] used compression to prove that all one-relation monoids satisfy the topological finiteness condition , for which the subspecial case was the only outstanding case. Building on this, Gray & Steinberg showed that all one-relation monoids satisfy the homological finiteness property , at the same time studying the algebraic properties of subspecial monoids [Citation14].
In another direction, a highly successful approach to studying the word problem for groups came through the methods of formal language theory. Such methods were introduced to group theory by An i¯s i¯mov [Citation5], and focus on the language-theoretic properties of the set of all words representing the identity element, calling this set the word problem of the group (its properties are generally independent of the finite generating set chosen). An i¯s i¯mov showed that the word problem of a group is a regular language if and only if the group is finite, and that the class of groups with context-free word problem is closed under free products. A little over ten years later, Muller & Schupp (supplemented by [Citation12]) gave a remarkable algebraic charactersiation, generally referred to as the Muller–Schupp theorem: a finitely generated group has context-free word problem if and only if it is virtually free [Citation32].
The language of words representing the identity element in a monoid is generally not very enlightening (though for special monoids it is [Citation36], as in the case of groups). To remedy this, Duncan & Gilman introduced a new language which encodes equality in a monoid [Citation11]. This set, which is also termed the word problem for the monoid, has many of the attractive properties of the analogous set for groups, such as invariance under generating set chosen. Furthermore, it is easy to show that a monoid has regular word problem if and only if it is finite, mimicking An i¯s i¯mov’s result. The word problem also generally behaves well under taking free products [Citation9, Citation37]. This notion of the word problem has also been studied in e.g. [Citation17, Citation18].
This paper seeks to study the language-theoretic properties of Adian & Oganesian’s weak compression. We will paint a fairly complete picture. The main theorem of this article (Theorem 1.1) will involve classes of languages C which are super- s, as introduced by Greibach [Citation15]. Such classes generalize the context-free languages and the indexed languages; roughly speaking, a full is a super- if it is closed under “recursion” (see Section 2.2 for details). We will prove the following main theorem:
Theorem 1.1.
Let M be a weakly compressible monoid, and let L(M) be its left monoid. Let C be a reversal-closed super- . Then M has word problem in C if and only if L(M) has word problem in C.
An overview of the article is as follows. In Section 2 we give some notation, useful concepts, and define two language-theoretic operations from [Citation37]. In Section 3, we define weak compression, amalgamating several authors’ definitions and notations. In Section 4, we prove Theorem 1.1. Finally, in Section 5 we give some corollaries of Theorem 1.1, particularly focused on context-free monoids. In particular, we will obtain many monoids for which the rational subset membership problem is decidable (Corollary 5.1); and that it is decidable whether a one-relation monoid containing a non-trivial idempotent has context-free word problem (Theorem 5.4).
This article is a condensed form of Chapter 4 of the author’s PhD thesis [Citation34].
2 Notation and auxiliary results
We assume the reader is familiar with the fundamentals of formal language theory. In particular, a full (abstract family of languages) is a class of languages closed under homomorphism, inverse homomorphism, intersection with regular languages, union, concatenation, and the Kleene star. For some background on this, and other topics in formal language theory, we refer the reader to standard books on the subject [Citation6, Citation16, Citation19]. The paper also assumes familiarity with the basics of the theory of semigroup, monoid, and group presentations, which will be written as , and , respectively. For further background see e.g. [Citation2, Citation10, Citation25, Citation27]. We also refer the reader to [Citation36, Citation37].
2.1 Monoids, words, rewriting
Let A be a finite alphabet, and let denote the free monoid on A, with identity element denoted ε or 1, depending on the context. Let A+ denote the free semigroup on A, i.e. . For , by we mean that u and v are the same word. For , we let denote the length of w, i.e. the number of letters in w. We have . If for , then we let denote the reverse of w, i.e. the word . For a language L, we define as the collection of all words where . We say that a class of languages C is reversal-closed if for every , we have . We say that the word is self-overlap free if it is empty, or else if it is non-empty and none of the proper non-empty prefixes of w is also a proper non-empty suffix of w. Thus xyxyy is self-overlap free, but xyxyx is not. If the words are equal in the monoid , then we denote this . Finally, when we say that a monoid M is generated by a finite set A, we mean that there exists a surjective homomorphism . In this case, will be used synonymously with .
We give some notation for rewriting systems. For an in-depth treatment and further explanations of the terminology, see e.g. [Citation8, Citation21]. A rewriting system on A is a subset of . An element of is called a rule. The system induces several relations on . We will write if there exist and a rule such that and . We let denote the reflexive and transitive closure of . We denote by the symmetric, reflexive, and transitive closure of . The relation defines the least congruence on containing . For , we let denote the set of ancestors of X, i.e. . The monoid is identified with the quotient . For a rewriting system and a monoid , we say that is M-invariant if for every rule , we have . That is, is M-invariant if and only if .
A rewriting system is said to be monadic if implies and . We say that is special if implies . Every special system is monadic. Let C be a class of languages. A monadic rewriting system is said to be C if for every , the language is in C. Thus, we may speak of e.g. monadic C-rewriting systems or monadic context-free rewriting systems.
Let M be a monoid with a finite generating set A. One way of encoding the structure of M as a language is by using the identity problem of M (with respect to A), defined as(2.1) (2.1)
When M is a group, this is the standard definition of the language-theoretic word problem, appearing e.g. in Muller & Schupp [Citation32] and Anisimov [Citation5]. However, for general monoids, (2.1) is generally not very useful. Instead, the central definition of this article will be the following. The (monoid) word problem of M (with respect to A) is defined as the language(2.2) (2.2) where is some fixed symbol not in A. For a class of languages C, we say that M has C-word problem if is in C. If C is closed under inverse homomorphism, then M having C-word problem does not depend on the finite generating set A chosen for M (see e.g. [Citation17]). The above definition (2.2) is due to Duncan and Gilman [Citation11, p. 522]. We will throughout this article use as a short-hand for the set .
2.2 Super- s
The theorems in this paper involve a special type of classes of languages, called super- s. These were introduced by Greibach [Citation15], using nested iterated substitutions. In [Citation37], the author gave an equivalent characterization of super- s, which we shall use here. We begin with a simple definition. Recall that for a language L and a rewriting system , the language denotes the language of ancestors of L under , i.e. the set of words which can be rewritten to some word in L.
Definition 1.
Let C be a class of languages. Let be a rewriting system. Then we say that is C-ancestry preserving if for every with , we have . If every monadic C-rewriting system is C-ancestry preserving, then we say that C has the monadic ancestor property.
Example 1.
If is a monadic context-free rewriting system, and is a context-free language, then is a context-free language [Citation7, Theorem 2.2]. That is, if we let CF denote the class of context-free languages, then every monadic CF-rewriting system is CF-ancestry preserving; in other words, the class of context-free languages has the monadic ancestor property.
Definition 2.
Let C be a full . Then C is said to be a super- if C has the monadic ancestor property.
For example, the class CF of context-free languages forms a super- by [Citation23, Theorem 1.2], and the class IND of indexed languages is also a super- [Citation4, Citation13]. Both are closed under reversal. On the other hand, neither the classes REG nor DCF of regular, resp. deterministic context-free, languages are super- s. Indeed, if C is a super-, then , by [Citation15, Theorem 2.2]. For more examples and generalizations, we refer the reader to the so-called hyper- s defined by Engelfriet [Citation13], all of which are super- s.
We remark that Greibach [Citation15] originally defines super-s via “nested iterated substitutions”, which behave similarly to taking ancestors under monadic rewriting systems. The author [Citation37, Proposition 2.2] has proved that the above definition, using rewriting systems, is equivalent to Greibach’s, and so we will use her results about super-s without restriction.
2.3 Alternating products and bipartisan ancestry
With general definitions taken care of, we will now turn to give some useful auxiliary language-theoretic results.
We first define an operation (the alternating product) on certain languages, which mimics the operation of the free product of semigroups. This operation appears in [Citation37]. Fix an alphabet A and let be a symbol not in A. Let be any language. We say that L is concatenation-closed (with respect to ) if(2.3) (2.3) where . The word problem of any finitely generated monoid is always a concatenation-closed language.
Let be a parametrisation such that either and , or else and , for all . A parametrisation X of this form will be called standard. Given two concatenation-closed languages in , we now present an operation for combining them. Let be concatenation-closed. Then the alternating product of L1 and L2 is defined as the language consisting of all words of the form:(2.4) (2.4) where for all we have , where X is a standard parametrisation. In particular, if and only if or . Alternating products are modeled on the (semigroup) free product, as the following example shows.
Example 2.
Let and . Then, of course, and . Now both languages L1 and L2 are concatenation-closed, and it is easy to see that we have
Thus .
Using the monadic ancestor property, one can prove the following useful statement.
Proposition 2.1.
[Citation37, Proposition 2.3] Let C be a super- . Let be concatenation-closed languages. Then .
We will require another operation with useful preservation properties, also introduced in [Citation37]. Let be two rewriting systems. Let . Then we define the bipartisan -ancestor of L as the language(2.5) (2.5)
Bipartisan ancestors can, informally speaking, manipulate the left and the right side (of in words in L) using resp. , and these manipulations can be independent of one another. We give an example of this independence.
Example 3.
Let , and let . Let be the rewriting system with the rules for all . Then
In particular, we have even if .
Proposition 2.2.
[Citation37, Proposition 2.5] Let C be a class of languages closed under rational transductions. Let , and let be rewriting systems. If are C-ancestry preserving, then .
Bipartisan ancestors are useful in describing the language theory of monoid (and group) free products. See [Citation37] for further details. Alternating products and bipartisan ancestors are the language-theoretic tools we shall require for this article. We now turn to the main subject of the article.
3 Weak compression
We present weak compression as it is defined in the survey [Citation35, Section 3.1] by the author. This is an amalgamation of other approaches [Citation3, Citation14, Citation22, Citation24, Citation41].
Let A be an alphabet. We say that a pair (u, v) of words is sealed by if . If a pair is sealed by some word, then it is clearly sealed by a unique self-overlap free word α. For example, (xyxpxyx, xyxqxyx) is sealed by xyx, but also by the self-overlap free word x.
Let M be the monoid defined by the presentation(3.1) (3.1)
Definition 3.
We say that the monoid defined by (3.1) is weakly compressible (with respect to α) if there is some self-overlap free word such that for all , the pair (ui , vi ) is sealed by α.
If M is not weakly compressible, then we say that M is incompressible. Any special monoid is incompressible by default (in particular groups are incompressible).
Let M be a weakly compressible monoid defined by (3.1). We will assume , for otherwise M is a finite cyclic monoid, and all is trivial. A word is called a left α-conjugator if . Let be the set of left α-conjugators which contain exactly one occurrence of α, i.e. . For ease of notation, we write instead of . Clearly, is a countably infinite suffix code (as ), and is the set of all left α-conjugators. Enumerate the words of as and fix a set of symbols in bijective correspondence with via the map . As is a suffix code, we can extend to an isomorphism of the free monoids and .
Every defining relation ui = vi in (3.1) is sealed by α, and as α is self-overlap free, we have . We factor ui , vi uniquely over the suffix code , yielding(3.2) (3.2)
Any word or appearing in the factorizations (3.2) for some is called a left piece of M. The set of all left pieces of M is denoted , which will be shortened to Σ. As M is finitely presented, Σ is a finite set of words. We let be the set of symbols from .
From the factorization (3.2), we define a new presentation(3.3) (3.3)
Definition 4.
The monoid defined by the presentation (3.3) is called the extended left monoid associated to the monoid M (defined by (3.1)), and is denoted . The submonoid of generated by Γ is the left monoid associated to M, and is denoted L(M).
It is obvious from (3.3) that , where * denotes the monoid free product, and is a free monoid of countably infinite rank, freely generated by . We emphasize that it is always decidable to compute the presentation (3.3) starting from the weakly compressible (3.1), as it only requires computing with regular languages. Of course, L(M) is finitely presented, with the same defining relations as in (3.3). Furthermore, the sum of the lengths of the defining relations in L(M) is strictly shorter than the corresponding sum for M. In particular, setting and for i > 1, there exists some such that is incompressible.Footnote1
We give an example. One can easily verify that if(3.4) (3.4) then the defining relation of M1 is sealed by xy. Factorizing both sides of the defining relation, we find and hence . Thus(3.5) (3.5)
This (special) monoid is incompressible. It is isomorphic to , which is isomorphic to the infinite cyclic group by removing the redundant generator b.
We return to the general case, and present the normal form results from [Citation3, Citation24]. First, note that it is obvious that if two words are equal in M, then u contains an occurrence of the self-overlap free word α if and only if v does; and furthermore, if neither u nor v contain α, then if and only if . Second, given any , there exist unique and such that . We call such a factorization the canonical form of u, and is called the α-part of u. Write , where . The word is called the γ-part of u. The following theorem is fundamental to weakly compressible monoids.
Theorem 3.1
(Adian & Oganesian, Lallement). Let M be a weakly compressible monoid defined by (3.1). Let have canonical forms and , respectively. Let be the γ-parts of u and v, respectively. Then if and only if (1) and ; and (2) in . Furthermore, (2) is equivalent to (2’) .
Remark 1.
The above result appears as [Citation24, Lemma 3.2] and [Citation3, Theorem 3.2] only in the case of one defining relation. However, the proofs of these results use no properties of one-relation monoids, and can, mutatis mutandis, also be used to prove the above result.
In particular, the word problem for M is decidable if and only if the word problem for L(M) is decidable. Furthermore, one can without much difficulty show that the left (right) divisibility problem for M reduces to the word and left (right) divisibility problems in L(M). In particular this solves the word and divisibility problems in the monoid M1 defined by (3.4).
4 Proof of Theorem 1.1
Let M be a weakly compressible (with respect to some self-overlap free word α) monoid defined by (3.1), and let , and Γ be as in Section 3. We will now prove that the language-theoretic properties of the monoid M defined by (3.1) can be reduced to the properties of the left monoid L(M). Let C be a reversal-closed super-, which will remain fixed throughout this section.
To simplify some technical notation we shall sometimes consider Σ, rather than Γ, as a finite generating set for the monoid L(M), as there exists a surjective homomorphism from onto L(M) given by the composition , where is the natural surjective homomorphism. However, it is important to notice that if , then generally is very distinct from , unlike what Theorem 3.1 may seem to suggest. Instead, using the canonical forms we find that we have(4.1) (4.1)
By the first remark preceding Theorem 3.1, it follows that the language is a union of the two disjoint languages(4.2) (4.2) (4.3) (4.3)
The language defined by (4.3) is the intersection of the context-free language with the regular language . In particular, is a context-free language. Any super- contains the class of context-free languages [Citation15, Theorem 2.2]. It follows that as C is a super-, we have that implies , as C is closed under union.
Having reduced the language-theoretic properties of to those of , we perform another reduction, which will yield Lemma 4.1. Let(4.4) (4.4)
Of course, we have the strict inclusions
Using this new language, we define the rewriting system(4.5) (4.5)
This is a monadic rewriting system. As C is a super-, if we now have , then it follows that is a C-rewriting system.
Lemma 4.1.
If , then .
Proof.
Suppose . By the remarks following (4.2) and (4.3), it suffices to show that . We claim that(4.6) (4.6)
This suffices to establish that , as C is closed under intersection with regular languages; is a context-free language and hence in C; and C has the monadic ancestor property.
Note first that for every rule we have . Hence, if , then it is clear by induction on the number of -rewritings necessary to transform w into an element of that . Hence the inclusion in (4.6) is proved.
Second, if , then for some with . Let and be the canonical forms of u and v, respectively. By Theorem 3.1, we have , and . Hence
Furthermore, . Thus from (4.5) we find
It follows that
Thus . As w was arbitrary, and as both u and v contain α, we have proved the inclusion in (4.6). This establishes the desired equality (4.6). □
Thus, to prove (the hard direction of) the main theorem, it suffices to show that the properties of reduce to the properties of L(M). This is non-trivial; a reduction to the properties of is suggested by Theorem 3.1, but is not a finitely generated monoid, being a free product of L(M) by an infinite rank free monoid . However, the word problem of is, informally speaking, essentially the context-free language . We will show that is, up to technical details, describable as an alternating product of the word problem of L(M) by this language , together with an appropriate amount of ancestry.Footnote2 By the preservation results for alternating products and ancestry, this yields the proof strategy for reducing to L(M).
Lemma 4.2.
If L(M) has word problem in C, then .
Before giving the proof of this key lemma, we need some setup and auxiliary lemmas. We introduce some notation for convenience. Let , and let . Obviously . Let for i = 1, 2. As Σi is a suffix code, it follows that , for i = 1, 2. For further ease of notation, we write , and . Then , where * denotes the monoid free product. In this new notation, it follows directly from Theorem 3.1 that if , then(4.7) (4.7) as in this case , and all defining relations of the presentation (3.3) of are pairs of words over the alphabet . Analogously, if , then(4.8) (4.8)
We define two rewriting systems on resp. . Let(4.9) (4.9) (4.10) (4.10)
Let now be any word, factorized uniquely as with for all and , where X is a standard parametrisation. We say that (this factorization of) u is reduced if or ; or if for all . Obviously, any irreducible descendant of u modulo is reduced, and any reduced word is clearly irreducible modulo . Furthermore, as is M-invariant, if , then , so we conclude that every word is equal in M to some reduced word (though this is generally not unique). Given any reduced , we can uniquely factorize it as a reduced factorization , where for , where Y is a standard parametrisation. This factorization of is called the normal form of the reduced word .
Lemma 4.3.
Let . Let , resp. , be any reduced forms of u, resp. v, with normal forms and respectively. Then if and only if (1) n = m, and (2) and for all , for some standard parametrisation X.
Proof.
The “if” direction follows by induction on n, and the simple observation that if , then as and begin with α we have
For the “only if” direction, by Theorem 3.1, it follows that if and only if . Now . As if and only , it follows from the fact that is reduced that is reduced with respect to the monoid free product , i.e. no non-empty subword of equals 1 in . The analogous statement is true for . Hence, as , it follows from the usual normal form lemma for monoid free products (see [Citation20, Section 8.2] or [Citation37, Section 1]) that (1) n = m; and that (2) and for , where X is some standard parametrisation. As if and only , the result follows. □
The rewriting systems and defined in (4.9) and (4.10) are close to being monadic. We show that, via appropriate rational transductions, we can extend the monadic ancestor property to also apply to these rewriting systems, using the self-overlap free property of α in a non-trivial way.
Lemma 4.4.
If L(M) has word problem in C, then and are C-ancestry preserving.
Proof.
Let be an arbitrary language with . To prove that is C-ancestry preserving, it suffices to show that . Let be the set of left-hand sides of rules in . We claim . Indeed, is such that if and only if by (4.7). Recalling the definition of from (2.1), it follows that(4.11) (4.11)
As L(M) has word problem in C, it follows that , by an easy transduction (cf. [Citation37, Lemma 3.5]). As C is a full and is a homomorphism, it hence follows from (4.11) that .
Let now be a new symbol. Let , and define the homomorphism by for all , and . Define a new rewriting system
Clearly is a monadic rewriting system. The language of left-hand sides of in is the intersection , and as it follows from the closure of C under rational transduction that is a C-rewriting system.
Now, as α is self-overlap free, is a suffix code, so for any word we can uniquely factor u as , where for . Hence there is exactly one word with the properties that (1) does not contain α; and (2) . The uniqueness of this word depends on the fact that α is self-overlap free.Footnote3 Of course, for any , there is also a unique such , namely . Let be the language . Then . Furthermore, by the above uniqueness argument, we have(4.12) (4.12)
Hence, from we conclude .
We now claim that . The right-hand side is the image under the homomorphism of the ancestor of under the monadic C-rewriting system . As C is a super-, the right-hand side is in C, and thus we would conclude that is C-ancestry preserving. The desired equality is easy to prove. Indeed, it follows directly from the fact that if and , then if and only if . This latter fact is proved by an easy induction on the number of rules applied, and we omit this proof. We conclude that is C-ancestry preserving.
The case of is symmetric, bearing in mind that (1) C is reversal-closed; (2) is self-overlap free if and only if α is self-overlap free, and (3) is a prefix code, rather than a suffix code (factorization over is still unique). We omit the details. □
One more step is needed before proving Lemma 4.2. Let , where the abbreviates “palindrome.” By (4.8), we have(4.13) (4.13)
Clearly is a context-free language, as is regular, so . Let be defined by for all , and . We define the (rather complicated-looking) language(4.14) (4.14)
We shall see in the below proof of Lemma 4.2 that . We first prove that is a language encoding the language-theoretic properties of L(M).
Lemma 4.5.
If L(M) has word problem in C, then .
Proof.
If L(M) has word problem in C, we have (cf. also the remark preceding (4.1)). This is a concatenation-closed language, as it is a word problem. Furthermore, is a context-free language and is easily checked to be concatenation-closed. Hence by Proposition 2.1, the alternating product is in C, as is its image under the homomorphism . By Lemma 4.4, and are C-ancestry preserving, so by Proposition 2.2, we have . □
Proof of Lemma 4.2.
It suffices, by Lemma 4.5, to prove that . We prove the inclusions one at a time.
. Let be arbitrary. Then we can write with and . Let be reduced forms of u resp. v such that and . As is M-invariant, we have and . Then by Lemma 4.3 we can writewith n = m, and for all , for some standard parametrisation X. Let X be such a parametrisation. Let(4.15) (4.15)
Now for every , we have if and only if . When X(i) = 1, then by (4.1) and (4.7) it follows that , so . When X(i) = 2, then by (4.8) we have , so . It follows that the right-hand side of (4.15) is an element of the alternating product of by , and hence so too is . Let , i.e. we just proved . Let(4.16) (4.16)
Then . We have , so of course . As , we have and so also . Hence, by the definition of -ancestors, we have(4.17) (4.17)
But is just w; and the right-hand side of (4.17) is , so we are done.
This argument is similar to the direction , and differs only in that it uses the M-invariance of . The proof is therefore only sketched here (but is detailed in the author’s Ph.D. thesis [Citation34, Lemma 4.2.13]).
Consider any word . Two things must be proved; first, that (1) ; and second, that (2) . For (1), it suffices to note that any word in is of the form with . As is M-invariant, any ancestor of will be equal to U in M; hence, in particular, we will also have by Theorem 3.1. The analogous argument for V implies that (1) holds.
For (2), since , there is some such thatas . By the M-invariance of , we have , and analogously also . Thus to show that satisfies (2), it suffices to show . But that this is the case can be easily seen by writing out as an element of the alternating product . □
By combining Lemmas 4.1 and 4.2, we conclude that if L(M) has word problem in C, then M has word problem in C. This is the difficult direction of the proof of Theorem 1.1. It remains to prove the converse:
Lemma 4.6.
If M has word problem in C, then L(M) has word problem in C.
Proof.
Indeed, as before let the homomorphism be defined by for all , and . Then it is easy to check, using (4.1), that(4.18) (4.18)
Now is an injective homomorphism, as is a code, implying that any word in the image of can be decoded by finding all the occurrences of , which must be surrounded by non-overlapping occurrences of α, and the remaining letters are then decoded as themselves. In particular, for any language L. Hence, applying to both sides in (4.18), we find that is a rational transduction of . As C is closed under rational transductions, it follows that L(M) has word problem in C. □
This completes the proof of Theorem 1.1.
5 Corollaries of Theorem 1.1
In this section, we present some corollaries of Theorem 1.1. In particular, we will show that it is decidable whether a one-relation monoid containing a non-trivial idempotent has context-free word problem (Theorem 5.4). This answers a generalisation of a question first asked by Zhang in 1992, which was answered affirmatively by the author in [Citation36]. We begin by applying the result to the rational subset membership problem for monoids.
5.1 Rational subset membership problem
Throughout this section we will fix a weakly compressible monoid M, defined by the presentation (3.1), which is compressible with respect to α. As the class of context-free (resp. indexed) languages is a super- closed under reversal, it of course follows from Theorem 1.1 that: M has context-free (resp. indexed) word problem if and only if L(M) does.
One of the direct consequences for a monoid having context-free word problem is decidability of its rational subset membership problem. Recall that for a finitely generated monoid M, generated by a finite set A and with associated surjective homomorphism , this decision problem asks: given a regular language and a word , can one decide whether ? The rational subset membership problem clearly specializes to the submonoid membership problem, the divisibility problems, and the word problem for M. It is easy to check that any context-free monoid has decidable rational subset membership problem; indeed, if if and only if w is equal to some word in R, i.e. if and only if , where denotes the right quotient. As is a regular language, and is a context-free language, the quotient is a context-free language; and membership in context-free languages is well-known to be (uniformly) decidable (cf. also [Citation36, Theorem 3.5]). We conclude:
Corollary 5.1.
Let M be a weakly compressible monoid. If L(M) has context-free word problem, then the rational subset membership problem for M is decidable.
Example 4.
Let be natural numbers such that n > 1, , and for all . Let be the monoid defined by
Then as the rewriting system with the single rule is complete, monadic, context-free, and defines Π, it follows that Π has context-free word problem [Citation7, Corollary 3.8]. Hence, by Theorem 1.1 any monoid which compresses to Π has context-free word problem. Let now A be a new alphabet, let be self-overlap free, and let for be pairwise distinct words. Let τ be the homomorphism defined by mapping for all . Let be the monoid defined by(5.1) (5.1)
Then one readily sees that . Hence has context-free word problem, for any choice of and wi (). As a concrete example, if and , then the monoid defined by(5.2) (5.2) has context-free word problem, as it compresses to the monoid defined by
Verifying directly that (5.2) has context-free word problem appears to be somewhat tedious.
We conclude that the monoid (5.2) has decidable rational subset membership problem, and more generally so too does any monoid of the form (5.1).
5.2 Subspecial one-relation monoids
We turn to the case when M in (3.1) is defined by a single relation u = v. Assume without loss of generality that . We say that M is subspecial if . Any special monoid (i.e. when the defining relation is u = 1) is obviously subspecial. An element is called an idempotent if . Of course, the identity element 1 is always a (trivial) idempotent. Associated to any idempotent e is a maximal subgroup, being the set of elements which are invertible with respect to this idempotent (alternatively, the group -class of e). Lallement [Citation24] proved that a one-relation monoid M contains a non-trivial idempotent if and only if (1) M is special, and not right cancellative; (2) and M is subspecial. By using Adian’s overlap algorithm (see [Citation36]), it is decidable whether a one-relation special monoid is right cancellative. Hence it is decidable whether a one-relation monoid M contains a non-trivial idempotent.
As M is subspecial, it is clearly weakly compressible. Furthermore, it is not hard to see that L(M) is also subspecial (see e.g. [Citation22, Lemma 5.4]). Thus to any subspecial monoid M we can associate a special monoid , obtained by iterating weak compression until we arrive at a special monoid. Hence M has context-free word problem if and only if does, by Theorem 1.1 (of course, the same is also true for any reversal-closed super-). Using the structural results about maximal subgroups of subspecial monoids by Gray & Steinberg [Citation14], we can prove the following extension of the Muller–Schupp theorem:
Theorem 5.2.
Let M be a subspecial (one-relation) monoid. Then M has context-free word problem if and only if all of its maximal subgroups are virtually free.
Proof.
First, assume M is special. Then, by a result due to Malheiro [Citation30, Theorem 4.6], all maximal subgroups of M are isomorphic to the group of units U(M) of M. The main theorem of [Citation36] states that M has context-free word problem if and only if U(M) is virtually free. The result thus follows in this case.
Suppose that M is subspecial, but not special. By [Citation14, Lemma 5.2], the maximal subgroups of M are all isomorphic to the group of units of , with the exception of the group of units of M, which is trivial (and hence virtually free). Let G be one of the non-trivial maximal subgroups of M, so that . Now is a finitely generated one-relation special monoid. Hence, by classical results (cf. [Citation36]), G is a finitely generated one-relator group.
Assume now that M has context-free word problem. Then, as G is a finitely generated subsemigroup of M, it follows from [Citation17, Proposition 8(b)] that G has context-free word problem. Hence, by the Muller–Schupp theorem, G is virtually free, as desired. For the converse, assume G is virtually free. By the easy direction of the Muller–Schupp theorem, G has context-free word problem. Then, by the main theorem of [Citation36], has context-free word problem. Hence, by the main theorem of the present paper, M has context-free word problem. □
This gives a complete algebraic characterization of the subspecial monoids with context-free word problem, extending the Muller–Schupp theorem to this class.
Corollary 5.3.
Let M be a subspecial one-relation monoid such that all of its maximal subgroups are virtually free. Then M has decidable rational subset membership problem.
For example, one can check that the subspecial monoid(5.3) (5.3) compresses (in a single step) to . Now it follows from [Citation2] that the group of units of is isomorphic to , which is isomorphic to by removing the redundant generator q. Hence the non-trivial maximal subgroups of M3 are all infinite cyclic, and in particular virtually free. Hence the monoid M3 defined by (5.3) has context-free word problem by Theorem 5.2, and it has decidable rational subset membership problem by Corollary 5.3.
We end with a final corollary of Theorem 5.2, the statement of which does not use compression or subspeciality.
Theorem 5.4.
There is an algorithm which takes as input a one-relation monoid containing a non-trivial idempotent, and decides whether M has context-free word problem.
Proof.
Suppose we are given a one-relation monoid presentation for M. Assume without loss of generality that . Then as M contains a non-trivial idempotent, either M is special, or else M is subspecial with . In the first case, the defining relation is u = 1, and by [Citation36, Theorem B] we can hence decide (uniformly in u) whether this special one-relation monoid has context-free word problem.
In the latter case, we repeatedly compress the relation u = v until we find(5.4) (5.4)
By Theorem 1.1, M has context-free word problem if and only if the special one-relation monoid (5.4) has context-free word problem; this latter problem is (uniformly) decidable by another application of [Citation36, Theorem B]. □
In 1992, Zhang [Citation40, Problem 3] asked if it is decidable whether a special one-relation monoid has context-free word problem. This was recently answered affirmatively by the author [Citation36, Theorem B]. The above Theorem 5.4 thus extends this answer from special to subspecial one-relation monoids.
Acknowledgments
The research in this article was carried out while the author was a Ph.D. student at the University of East Anglia. The author wishes to thank his supervisor Dr Robert D. Gray for many useful comments and discussions, and the anonymous referee for very thorough and careful comments, all of which significantly improved the article.
Additional information
Funding
Notes
1 The compression defined by Lallement [24] transforms M into in a single step, whereas that by Adian & Oganesian [3] or indeed Kobayashi [22] corresponds to transforming M into L(M).
2 The alternating products and ancestry defined and used by the author in [37] were, in fact, first developed by the author to deal with precisely the problem of describing the word problem of .
3 For example, if we take the word , which has self-overlaps, then .
References
- Adian, S. I. (1960). The problem of identity in associative systems of a special form. Soviet Math. Dokl. 1:1360–1363.
- Adian, S. I. (1966). Defining relations and algorithmic problems for groups and semigroups. Proceedings of the Steklov Institute of Mathematics, No. 85.
- Adian, S. I., Oganesian, G. U. (1978). On problems of equality and divisibility in semigroups with a single defining relation. Math. USSR, Izv. 12:207–212. [Izv. Akad. Nauk SSSR Ser. Mat. 42(2)].
- Aho, A. V. (1968). Indexed grammars—an extension of context-free grammars. J. Assoc. Comput. Mach. 15: 647–671. DOI: 10.1145/321479.321488.
- An i¯s i¯mov, A. V. (1971). The group languages. Kibernetika (Kiev) 4:18–24.
- Berstel, J., Perrin, D. (1985). Theory of Codes, volume 117 of Pure and Applied Mathematics. Orlando, FL: Academic Press, Inc.
- Book, R. V., Jantzen, M., Wrathall, C. (1982). Monadic Thue systems. Theoret. Comput. Sci. 19(3):231–251. DOI: 10.1016/0304-3975(82)90036-6.
- Book, R. V., Otto, F. (1993). String-Rewriting Systems. Texts and Monographs in Computer Science. New York: Springer-Verlag.
- Brough, T., Cain, A. J., Pfeiffer, M. (2019). Context-free word problem semigroups. In: Hofman, P., Skrzypczak, M., eds. Developments in Language Theory, volume 11647 of Lecture Notes in Computer Science. Cham: Springer, pp. 292–305.
- Campbell, C. M., Robertson, E. F., Ruškuc, N., Thomas, R. M. (1995). Semigroup and group presentations. Bull. London Math. Soc. 27(1):46–50. DOI: 10.1112/blms/27.1.46.
- Duncan, A., Gilman, R. H. (2004). Word hyperbolic semigroups. Math. Proc. Cambridge Philos. Soc. 136(3): 513–524. DOI: 10.1017/S0305004103007497.
- Dunwoody, M. J. (1985). The accessibility of finitely presented groups. Invent. Math. 81(3):449–457. DOI: 10.1007/BF01388581.
- Engelfriet, J. (1985). Hierarchies of hyper-AFLs. J. Comput. System Sci. 30(1):86–115. DOI: 10.1016/0022-0000(85)90006-6.
- Gray, R. D., Steinberg, B. (2022). A Lyndon’s identity theorem for one-relator monoids. Selecta Math. (N.S.) 28(3):Paper No. 59, 53.
- Greibach, S. A. (1970). Full AFLs and nested iterated substitution. Inf. Control 16:7–35.
- Harrison, M. A. (1978). Introduction to Formal Language Theory. Reading, MA: Addison-Wesley Publishing Co.
- Hoffmann, M., Holt, D. F., Owens, M. D., Thomas, R. M. (2012). Semigroups with a context-free word problem. In: Yen, H.-C., Ibarra, O. H., eds. Developments in Language Theory, volume 7410 of Lecture Notes in Computer Science. Heidelberg: Springer, pp. 97–108.
- Holt, D. F., Owens, M. D., Thomas, R. M. (2008). Groups and semigroups with a one-counter word problem. J. Aust. Math. Soc. 85(2):197–209. DOI: 10.1017/S1446788708000864.
- Hopcroft, J. E., Ullman, J. D. (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley Series in Computer Science. Reading, MA: Addison-Wesley Publishing Co.
- Howie, J. M. (1995). Fundamentals of Semigroup Theory, volume 12 of London Mathematical Society Monographs. New Series. New York: The Clarendon Press, Oxford University Press, Oxford Science Publications.
- Jantzen, M. (1988). Confluent String Rewriting, volume 14 of EATCS Monographs on Theoretical Computer Science. Berlin: Springer-Verlag.
- Kobayashi, Y. (2000). Finite homotopy bases of one-relator monoids. J. Algebra 229(2):547–569. DOI: 10.1006/jabr.1999.8251.
- Král, J. (1970). A modification of a substitution theorem and some necessary and sufficient conditions for sets to be context-free. Math. Systems Theory 4:129–139. DOI: 10.1007/BF01691097.
- Lallement, G. (1974). On monoids presented by a single relation. J. Algebra 32:370–388. DOI: 10.1016/0021-8693(74)90146-X.
- Lyndon, R. C., Schupp, P. E. (1977). Combinatorial Group Theory. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 89. Berlin-New York: Springer-Verlag.
- Magnus, W. (1932). Das Identitätsproblem für Gruppen mit einer definierenden Relation. Math. Ann. 106(1): 295–307. DOI: 10.1007/BF01455888.
- Magnus, W., Karrass, A., Solitar, D. (1966). Combinatorial Group Theory: Presentations of Groups in Terms of Generators and Relations. New York-London-Sydney: Interscience Publishers [John Wiley & Sons, Inc.].
- Makanin, G. S. (1966). On the identity problem for finitely presented groups and semigroups. PhD thesis, Steklov Mathematical Institute, Moscow.
- Makanin, G. S. (1966). On the identity problem in finitely defined semigroups. Dokl. Akad. Nauk SSSR 171: 285–287.
- Malheiro, A. (2005). Complete rewriting systems for codified submonoids. Int. J. Algebra Comput. 15(2):207–216. DOI: 10.1142/S0218196705002220.
- Markov, A. A. (1947). On certain insoluble problems concerning matrices. Doklady Akad. Nauk SSSR (N. S.) 57: 539–542.
- Muller, D. E., Schupp, P. E. (1983). Groups, the theory of ends, and context-free languages. J. Comput. System Sci. 26(3):295–310. DOI: 10.1016/0022-0000(83)90003-X.
- Nyberg-Brodda, C.-F. (2021). A translation of G. S. Makanin’s 1966 Ph.D. thesis “On the Identity Problem for Finitely Presented Groups and Semigroups”. Available online at arXiv:2102.00745.
- Nyberg-Brodda, C.-F. (2021). The word problem and combinatorial methods for groups and semigroups. PhD thesis, University of East Anglia, UK.
- Nyberg-Brodda, C.-F. (2021). The word problem for one-relation monoids: a survey. Semigroup Forum 103(2): 297–355. DOI: 10.1007/s00233-021-10216-8.
- Nyberg-Brodda, C.-F. (2022). On the word problem for special monoids. Semigroup Forum 105(1):295–327. DOI: 10.1007/s00233-022-10286-2.
- Nyberg-Brodda, C.-F. (2023). On the word problem for free products of semigroups and monoids. J. Algebra 622:721–741. DOI: 10.1016/j.jalgebra.2023.02.007.
- Post, E. L. (1947). Recursive unsolvability of a problem of Thue. J. Symbolic Logic 12:1–11. DOI: 10.2307/2267170.
- Thue, A. (1914). Problem über Veränderungen von Zeichenreihen nach gegebenen Regeln. Christiana Videnskaps-Selskabs Skrifter, I. Math. naturv. Klasse, 10.
- Zhang, L. (1992). Congruential languages specified by special string-rewriting systems. In: Ito, M., ed. Words, Languages and Combinatorics (Kyoto, 1990). River Edge, NJ: World Scientific Publishing, pp. 551–563.
- Zhang, L. (1992). On the conjugacy problem for one-relator monoids with elements of finite order. Int. J. Algebra Comput. 2(2):209–220. DOI: 10.1142/S021819679200013X.