Something is ‘liminal’ if it sits right on a kind of perceptual boundary: so it is surely the Voynich Manuscript’s liminality – that is, its apparent inability to be included in or excluded from any category or pigeonhole – that makes it such an infuriating object to study. Why can’t we prove or disprove that it is a language, a cipher, a shorthand, indeed an anything? Why, centuries after it was constructed by person or persons unknown for reasons unknown, are we still unable to drag any part of it kicking and screaming into the light of certainty?
Yet a recent email here from BC helps demonstrate the difficulties we face when we try to do this. He (very reasonably) asks:
“What do you think best explains the lack of repeated sequences? (i.e. there are almost no repetitions of any group of 3+ consecutive words). I would think that disproves the hypothesis of a pure natural language already.”
It’s a fair point (and I’d add that it works equally well as a disproof for both “pure natural language” and simple substitution ciphers, which are almost exactly the same thing). Moreover, many of the repeats that you do find within the Voynichese corpus are qokedy/qokeedy blocks: words which, as I recall (but please correct me if I’m wrong), combine a small information content with a strong affinity for sitting next to one another, such that trivial repeats of these are statistically almost certain to be found somewhere in the text.
Yet conversely, it could be argued that if a pair of instances were to be found where a longer non-trivial block is repeated, that would surely throw a statistical spanner of improbability into that reasoning’s smoothly rotating spokes, in much the same way that the statistical improbability of the Gillogly strings militates strongly against most non-DOI-based readings of Beale Paper B1.
And so it is with all that in mind that Torsten Timm points – in his interesting and challenging paper that I will discuss in more detail another day – to a particularly intriguing (nearly-)repeating sequence pair, both halves of which are on page f84r:-
<f84r.P.3> shedy qokedy qokeedy qokedy chedy okain chey
<f84r.P.10> shedy qokedy qokeedy qokeedy chedy raiin chey
This is surely as close to a “Gillogly sequence” as we get in Voynichese. In fact, this to me is very much as if we are looking through a gap in the confounding clouds, insofar as it seems that the same (or at least very similar) plaintext sequence is being processed in two slightly different ways by the same system to yield two extremely close Voynichese sequences.
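To make "extremely close" concrete, here is a minimal Python sketch that compares the two f84r lines word by word. The transcription is copied from the two lines quoted above; the helper name `word_mismatches` is purely illustrative, and counting position-by-position differences is just one simple way of measuring closeness, not a claim about how Timm himself compared them.

```python
# The two lines from f84r, as transcribed above, split into words.
line_p3 = "shedy qokedy qokeedy qokedy chedy okain chey".split()
line_p10 = "shedy qokedy qokeedy qokeedy chedy raiin chey".split()


def word_mismatches(a, b):
    """Return (position, word_a, word_b) for every position where the
    two equal-length word sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of words")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = word_mismatches(line_p3, line_p10)
for position, first, second in diffs:
    print(position, first, second)
print(len(line_p3) - len(diffs), "of", len(line_p3), "words match")
```

Five of the seven words agree, and one of the two differences (qokedy versus qokeedy) falls inside the trivial block.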
Yet the almost complete absence of any other reasonable-length sequence pairs throughout the Voynich Manuscript’s hundreds of pages speaks loudly against the idea that what we are looking at is either a natural language or just about any straightforward cipher. So this pair is arguably most useful as a demonstration of how weak many of our current proofs and disproofs are.
As a consequence, my current answer (to “What do you think best explains the lack of repeated sequences?”) would be that the Voynichese text seems to have been consciously constructed in such a way as to avoid including non-trivial repeating sequences (i.e. I don’t really include “qokedy/qokeedy” sequences in this).
But this comes with a caveat: that this “Timm pair” is then probably the keenest example we have of a slip-up in the generally excellent execution of a tricky system specifically designed to avoid including non-trivial phrase repetitions (and which almost completely managed to succeed in this ambitious aim).
Yet because it contains three trivial consecutive qokedy/qokeedy words, it plainly suffers from the weakness that its existence might just be a statistical coincidence, of the kind Dave Oranchak sees suggested so often for Zodiac Killer cipher patterns. Hence its inherent liminality: we just can’t tell for sure whether it’s a break in the system or a freak occurrence fooling us into thinking it’s a break in the system.
…unless you happen to know of any other “Timm pair”-like sequences that are even more solid?
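For anyone who wants to go hunting, here is a rough Python sketch of one way to search a transcription for other near-repeats. Everything in it is an assumption made for illustration: the input format (a list of locus/text pairs), the function names, the window length, the mismatch threshold, and the choice to treat only qokedy and qokeedy as "trivial" words. It compares every window against every other, so it is slow on a full transcription, and it says nothing about whether a hit is statistically meaningful.

```python
from itertools import combinations

# Assumption: only these two words are treated as "trivial" filler.
TRIVIAL = {"qokedy", "qokeedy"}


def windows(lines, n):
    """Yield (locus, start_index, words) for each run of n consecutive
    words within a single line."""
    for locus, text in lines:
        words = text.split()
        for i in range(len(words) - n + 1):
            yield locus, i, tuple(words[i:i + n])


def near_repeats(lines, n=7, max_mismatches=2, min_nontrivial_matches=3):
    """Return pairs of n-word windows that agree in all but at most
    max_mismatches positions, and that share at least
    min_nontrivial_matches matching words outside TRIVIAL."""
    hits = []
    for (loc_a, i, a), (loc_b, j, b) in combinations(windows(lines, n), 2):
        if loc_a == loc_b and abs(i - j) < n:
            continue  # skip windows that overlap within the same line
        matches = [x for x, y in zip(a, b) if x == y]
        mismatches = n - len(matches)
        nontrivial = sum(1 for w in matches if w not in TRIVIAL)
        if mismatches <= max_mismatches and nontrivial >= min_nontrivial_matches:
            hits.append((loc_a, i, loc_b, j, mismatches))
    return hits


# The only data used here is the pair quoted in this post.
sample = [
    ("f84r.P.3", "shedy qokedy qokeedy qokedy chedy okain chey"),
    ("f84r.P.10", "shedy qokedy qokeedy qokeedy chedy raiin chey"),
]
print(near_repeats(sample))
```

Run on just the two f84r lines, this reports the Timm pair with two mismatches. Tightening `max_mismatches` to 1 makes it vanish, which is a reminder of how much any such search depends on the thresholds chosen.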