Anyone who proposes that Voynichese works in a ‘flat’ (i.e. straightforward) way has a number of extremely basic problems to overcome.
For a start, there are Voynichese’s ‘LAAFU’ (Emma May Smith’s acronym for Captain Prescott Currier’s phrase “Line As A Functional Unit”, though she now prefers to talk about “line patterns”) behaviours to account for. These relate to the curious ways that letters and words behave at both the start and the end of lines, many of which are discussed by Emma May Smith here:
- Line-first words have a quite different first-letter distribution from the main body of words’ first-letter distribution
- Line-first words are slightly longer than expected
- Line-second words are slightly shorter than expected
- Line-final words frequently end in EVA ‘m’ / ‘am’
At the same time, there are also some odd PAAFU (“Paragraph As A Functional Unit”) behaviours to consider. The most famous of these is the way that the first letter of a paragraph (and even more so of the first paragraph on a page) has a significantly different distribution from elsewhere, one that strongly favours gallows characters (and in particular the single leg gallows EVA ‘p’).
But the other major PAAFU behaviour is that single leg gallows glyphs appear predominantly on the first line of paragraphs, and only rarely elsewhere (these are known as Tiltman lines, after my hero John Tiltman). You can see this throughout the Voynich Manuscript, right from Herbal A page f3r…
![](https://ciphermysteries.com/wp-content/uploads/sites/6/2020/08/f3r-266x300.jpg)
…to the Herbal B page f43r (which has an extra single leg gallows, but the remaining ones all sit on the first line of their respective paragraphs)…
![](https://ciphermysteries.com/wp-content/uploads/sites/6/2020/08/f43r-223x300.jpg)
…to the Q13 Balneo page f76v (where there are two extra single leg gallows, sure, but the rest of the page slavishly follows the pattern)…
![](https://ciphermysteries.com/wp-content/uploads/sites/6/2020/08/f76v-223x300.jpg)
So, even though the internal structure of Voynichese words changes significantly across the different sections (and that’s a separate topic entirely), this single-leg-gallows-mainly-on-top-lines-of-paragraphs Tiltman behaviour seems to remain essentially constant throughout them all.
This is an issue that has been floating round for decades, and I would be surprised if it had originated even from John Tiltman. More recently, Rene Zandbergen discussed it on voynich.ninja back in 2017, pointing out that this behaviour appeared – in his view – to be inconsistent with any model for Voynichese that was inherently uniform (which I call ‘flat’ here), whether linguistic, cryptographic or whatever.
So, the challenge to anyone trying to come up with some kind of theory for the Voynichese text is simply to explain why this unexpected behaviour is the way it is. What kind of mechanism could be behind it?
Q20 Paragraph-Initial Glyphs
For the rest of this post, I’m going to restrict my discussion to the twenty-three Voynich Q20 (‘Quire 20’) pages, simply because their lack of drawings makes them particularly easy to work with.
The first thing to point out is that we have two single leg gallows behaviours (very frequent at paragraph starts, and very frequent on the top line of paragraphs) which overlap somewhat.
For example, f103r (the first bound page of Q20) has 19 starred paragraphs, of which 9 begin with the single leg gallows EVA ‘p’ (i.e. 47.4%). And if you count all the paragraph-initial p’s and f’s in Q20, you get:
| Page | p | f | paras |
| --- | --- | --- | --- |
| f103r | 9 | 0 | 18 |
| f103v | 7 | 0 | 14 |
| f104r | 5 | 0 | 13 |
| f104v | 7 | 0 | 13 |
| f105r | 7 | 0 | 10 |
| f105v | 7 | 0 | 10 |
| f106r | 11 | 0 | 15 |
| f106v | 6 | 1 | 15 |
| f107r | 9 | 1 | 15 |
| f107v | 10 | 0 | 15 |
| f108r | 6 | 2 | 16 |
| f108v | 7 | 0 | 8 |
| f111r | 4 | 0 | 6 |
| f111v | 7 | 0 | 8 |
| f112r | 8 | 1 | 12 |
| f112v | 8 | 0 | 13 |
| f113r | 7 | 3 | 17 |
| f113v | 10 | 4 | 15 |
| f114r | 5 | 2 | 13 |
| f114v | 5 | 0 | 12 |
| f115r | 4 | 2 | 13 |
| f115v | 6 | 0 | 13 |
| f116r | 6 | 0 | 8 |
| Total | 161 | 16 | 292 |
The values for Q20 as a whole are remarkably consistent: there is a 161/292 = 55.14% chance that a paragraph starts with EVA p, and a 16/292 = 5.48% chance that a paragraph starts with EVA f.
Given that ‘p’ makes up 1.03% of the glyphs in Q20 (‘f’ makes up 0.19%), ‘p’ is ~54x more likely to appear as the first glyph of a Q20 paragraph than its overall frequency would suggest: even ‘f’ is ~28x more likely to appear paragraph-initial than elsewhere. That’s striking, and not at all flat.
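As a cross-check on the arithmetic above, here is a minimal Python sketch that re-sums the table and divides each paragraph-initial rate by the glyph's overall share. The names `Q20_COUNTS` and `paragraph_initial_stats` are mine (not from any existing tool), and the glyph shares are the rounded 1.03% and 0.19% figures quoted above, so the ratios are only approximate.

```python
# Per-page counts copied from the table above:
# (page, paragraph-initial p, paragraph-initial f, paragraphs)
Q20_COUNTS = [
    ("f103r", 9, 0, 18), ("f103v", 7, 0, 14), ("f104r", 5, 0, 13),
    ("f104v", 7, 0, 13), ("f105r", 7, 0, 10), ("f105v", 7, 0, 10),
    ("f106r", 11, 0, 15), ("f106v", 6, 1, 15), ("f107r", 9, 1, 15),
    ("f107v", 10, 0, 15), ("f108r", 6, 2, 16), ("f108v", 7, 0, 8),
    ("f111r", 4, 0, 6), ("f111v", 7, 0, 8), ("f112r", 8, 1, 12),
    ("f112v", 8, 0, 13), ("f113r", 7, 3, 17), ("f113v", 10, 4, 15),
    ("f114r", 5, 2, 13), ("f114v", 5, 0, 12), ("f115r", 4, 2, 13),
    ("f115v", 6, 0, 13), ("f116r", 6, 0, 8),
]


def paragraph_initial_stats(counts, p_share=0.0103, f_share=0.0019):
    """Sum the table and compare paragraph-initial rates with glyph shares.

    p_share and f_share are the (rounded) fractions of all Q20 glyphs
    that are 'p' and 'f', as quoted in the post.
    """
    total_p = sum(row[1] for row in counts)
    total_f = sum(row[2] for row in counts)
    paragraphs = sum(row[3] for row in counts)
    p_rate = total_p / paragraphs
    f_rate = total_f / paragraphs
    return {
        "total_p": total_p,
        "total_f": total_f,
        "paragraphs": paragraphs,
        "p_rate": p_rate,
        "f_rate": f_rate,
        # How many times more often than its overall share each glyph
        # turns up in paragraph-initial position.
        "p_enrichment": p_rate / p_share,
        "f_enrichment": f_rate / f_share,
    }


if __name__ == "__main__":
    for name, value in paragraph_initial_stats(Q20_COUNTS).items():
        print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

Because the glyph shares are only given to two or three significant figures, the enrichment ratios shift by a point or so depending on the rounding; the order of magnitude does not.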
Q20 Tiltman Lines
Q20 contains about 10700 words across about 1100 lines (I don’t have the exact figures to hand): 643 of these words contain a single leg gallows, i.e. the raw chance any given Q20 word contains a single leg gallows = 643/10700 = 6%.
But whatever the explanation for p being so strongly biased to this paragraph-initial position, I think we should try to separate the single-leg-paragraph-initial behaviour from the single-leg-top-line (Tiltman) behaviour.
So if we remove the 292 paragraph-initial words (of which 161 + 16 = 177 begin with a single leg gallows), the raw chance that any non-paragraph-initial Q20 word contains a single leg gallows goes down to (643-177)/(10700-292) = 4.5%, which is our baseline figure here.
But what of top-line-but-not-initial Q20 words? Given that Q20 has 292 paragraphs, each with a first line containing (say) ten words, and we are removing the first word, we have 292 x ~9 = ~2628 top-line words of interest. Of these (by my counting), 353 contain a ‘p’, and 80 contain an ‘f’. Hence the probability that any given Q20 paragraph-top-line-but-not-initial word contains a single leg gallows is 433/2628 = 16.5%.
Similarly, the probability that any given non-top-line Q20 word contains a single leg gallows is roughly (643-177-433)/(10700-292*10) = 0.4%. So if we discount all the paragraph-initial words, words containing single leg gallows are about 16.5%/0.4% = ~41x more likely to appear on the top lines of paragraphs than on the other lines (or nearer ~39x if you use the unrounded figures).
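The top-line versus non-top-line comparison can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this section: the function name `tiltman_rates` is hypothetical, the word total is the approximate 10700 figure, and the ten-words-per-top-line value is the same rough assumption used above, exposed as a parameter so its effect can be tested.

```python
TOTAL_WORDS = 10700          # approximate Q20 word count (not exact)
GALLOWS_WORDS = 643          # words containing a single leg gallows
PARAGRAPHS = 292             # paragraph total from the table
INITIAL_GALLOWS = 161 + 16   # paragraph-initial p + f
TOP_LINE_GALLOWS = 353 + 80  # top-line, non-initial words with p or f


def tiltman_rates(words_per_top_line=10):
    """Return (top_rate, other_rate, ratio) for single leg gallows words.

    words_per_top_line is an assumed average, not a measured value.
    """
    # Top-line words excluding the paragraph-initial word.
    top_words = PARAGRAPHS * (words_per_top_line - 1)
    top_rate = TOP_LINE_GALLOWS / top_words

    # Everything not on a paragraph's top line.
    other_gallows = GALLOWS_WORDS - INITIAL_GALLOWS - TOP_LINE_GALLOWS
    other_words = TOTAL_WORDS - PARAGRAPHS * words_per_top_line
    other_rate = other_gallows / other_words

    return top_rate, other_rate, top_rate / other_rate


if __name__ == "__main__":
    top, other, ratio = tiltman_rates()
    print(f"top-line rate:     {top:.1%}")
    print(f"non-top-line rate: {other:.1%}")
    print(f"ratio:             {ratio:.1f}x")
```

Working from unrounded intermediate values gives a ratio slightly lower than the one obtained by dividing the rounded percentages, and changing `words_per_top_line` to 9 or 11 shows how sensitive the result is to that guess.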
Q20 Neal Keys
One of the interesting things that has been noted about these single leg gallows on the top line of paragraphs is that they often seem to appear in adjacent words. Voynich researcher Philip Neal first mentioned having noticed this at a Voynich pub meet a fair few years ago: at the time, I christened them Neal keys.
But even though this is a visually striking thing, is it statistically significant, particularly if we remove all the paragraph-initial single leg gallows first?
For non-paragraph-initial-top-line words, the raw (expected) probability that a pair of adjacent words both contain a single leg gallows would seem to be 16.5% x 16.5% = 2.7%.
My counts for the actual number of pairs of adjacent non-paragraph-initial-top-line Q20 words both containing single leg gallows (i.e. ignoring all paragraph-initial words) were 5/5/6/1/8/12/7/6/7/4/5/0/8/6/3/5/9/4/12/5/1/5/2 = 126 instances out of (353 + 80) = 433.
So, of the 292 x (9-1) = ~2336 potential adjacent pairs (discounting the end word of each top line), 126 instances point to a chance of 126/2336 = 5.4%.
My conclusion is therefore that the phenomenon of Neal keys (pairs of adjacent words containing single leg gallows on the top line of paragraphs) is, while visually striking, only 2x the expected value.
To be clear, the phenomenon is definitely there, but the main factor driving it appears to be the very strong tendency for single leg gallows to appear on the top line of paragraphs, rather than the adjacency pairing per se.
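For anyone wanting to re-run the Neal key comparison, here is a short Python sketch of the same calculation. It assumes (as the estimate above implicitly does) that adjacent words would be independent under the null model, and it reuses the rough figure of ten words per top line; `PAIR_COUNTS` and `neal_key_ratio` are illustrative names only.

```python
# Per-page counts of adjacent top-line word pairs that both contain a
# single leg gallows, in page order (f103r ... f116r), as listed above.
PAIR_COUNTS = [5, 5, 6, 1, 8, 12, 7, 6, 7, 4, 5, 0,
               8, 6, 3, 5, 9, 4, 12, 5, 1, 5, 2]


def neal_key_ratio(pair_counts, paragraphs=292, words_per_top_line=10,
                   top_line_gallows=353 + 80):
    """Return (expected, observed, ratio) for adjacent gallows pairs.

    'expected' treats the two words of a pair as independent draws,
    which is an assumption rather than an established fact.
    """
    non_initial = words_per_top_line - 1          # drop the first word
    p_word = top_line_gallows / (paragraphs * non_initial)
    expected = p_word ** 2

    # Each top line of n non-initial words offers n - 1 adjacent pairs.
    pair_slots = paragraphs * (non_initial - 1)
    observed = sum(pair_counts) / pair_slots

    return expected, observed, observed / expected


if __name__ == "__main__":
    expected, observed, ratio = neal_key_ratio(PAIR_COUNTS)
    print(f"expected: {expected:.1%}  observed: {observed:.1%}  ratio: {ratio:.2f}x")
```

A fuller test would compare the observed count against shuffled top lines rather than a simple product of probabilities, but that needs the transcription itself rather than these summary counts.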
Verification
I’ve done a lot of this manually, because I didn’t have sufficient automated tools to hand. So can one or more other Voynich researchers please verify these figures?
- I used the Takahashi EVA transcription
- I counted ch / sh / ckh / cfh / cph / cth as individual glyphs
- I didn’t count space characters in the percentages
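To make the glyph-counting convention above concrete, here is a minimal Python sketch of one way to split EVA text into glyphs so that ch / sh / ckh / cfh / cph / cth each count as a single glyph and separators are ignored. This is not the procedure I actually used (which was largely manual), the function names are hypothetical, and the sample string in the usage section is made up for illustration rather than taken from the Takahashi transcription.

```python
import re
from collections import Counter

# Three-letter glyphs are listed before two-letter ones so that, for
# example, 'cph' is consumed whole rather than as 'c' + 'p' + 'h'.
# Anything that is not a lowercase letter (spaces, '.', etc.) is skipped.
_GLYPH_RE = re.compile(r"ckh|cfh|cph|cth|ch|sh|[a-z]")


def eva_glyphs(text):
    """Split EVA text into a list of glyphs using the convention above."""
    return _GLYPH_RE.findall(text)


def glyph_shares(text):
    """Return each glyph's fraction of all glyphs in the text."""
    counts = Counter(eva_glyphs(text))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {glyph: n / total for glyph, n in counts.items()}


if __name__ == "__main__":
    sample = "pchedy.qokeedy cphol.pol"  # invented sample, not a real line
    print(eva_glyphs(sample))
    print(glyph_shares(sample).get("p", 0.0))
```

Note that under this convention a ‘p’ inside ‘cph’ is not counted as a standalone ‘p’, which matters when comparing glyph percentages produced by different tools.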