Since the Voynich Manuscript surfaced in about 1912, many of the best-known codebreaking experts have studied its writing (‘Voynichese’) in depth. Of these, many have concluded that it was written using a cipher system that was (a) stronger than a simple (monoalphabetic) substitution cipher, yet (b) mathematically weaker than a polyalphabetic cipher.
If the University of Arizona’s 2009 radiocarbon dating of the Voynich Manuscript’s vellum (which points to the first half of the fifteenth century) is correct, the most likely reason for (b) becomes blindingly obvious: polyalphabetic ciphers (such as those of Leon Battista Alberti, Abbot Trithemius, and Vigenère) hadn’t yet been invented.
So, does that mean that all pre-polyalphabetic ciphers were easy? Errm… nope. In fact: not even close.
Fourteenth Century Cryptography
Even though Gabriele de Lavinde’s 1379 collection of Vatican ciphers was, at heart, a set of simple (monoalphabetic) ciphers, many also included “nulls” (special cipher shapes that code for nothing at all, added into ciphertexts specifically to misdirect codebreakers). In the hands of a tricksy encipherer, these can already make a ciphertext far from straightforward to crack.
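As a minimal sketch of how nulls work in practice (the letter table and the null shapes below are invented for illustration, not taken from de Lavinde's ledger):

```python
import random

# Toy monoalphabetic table plus nulls, in the 1379 style.
# Both the mapping and the null symbols are invented stand-ins.
PLAIN  = "abcdefghijklmnopqrstuvwxyz"
CIPHER = "qwertyuiopasdfghjklzxcvbnm"
TABLE  = dict(zip(PLAIN, CIPHER))
NULLS  = "789"  # shapes that encipher nothing at all

def encipher(plaintext: str, null_rate: float = 0.25, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in plaintext.lower():
        if ch in TABLE:
            out.append(TABLE[ch])
            # randomly sprinkle in a null to pad and misdirect
            if rng.random() < null_rate:
                out.append(rng.choice(NULLS))
    return "".join(out)

def decipher(ciphertext: str) -> str:
    # the legitimate recipient simply skips the agreed null shapes
    inverse = {v: k for k, v in TABLE.items()}
    return "".join(inverse[ch] for ch in ciphertext if ch not in NULLS)
```

The recipient, knowing the key, just drops the null shapes; a codebreaker who doesn't know which shapes are nulls is left trying to analyse a symbol-frequency profile that has been deliberately polluted.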
Even the very clever CryptoCrack doesn’t have a tool for predicting / identifying nulls in a given ciphertext: and it turns out (I believe) that this is a significantly harder technical challenge than you might think.
Moreover, many of the ciphers in Gabriele de Lavinde’s cipher ledger also contained a nomenclator: a list of (typically) a dozen or so shapes, each enciphering an entire word, like a cross between a cipher and a code. (Broadly speaking, a ‘cipher’ enciphers a message a letter at a time, while a ‘code’ encodes a message a word at a time: so nomenclators blur the line between the two).
However, it’s far from clear (to me at least) whether nomenclators were added in the 14th century for security, speed or brevity. I suspect that to insist that it was just a matter of security would be to project principles of Schneieresque computer science onto the codemakers and codebreakers of the 1300s: the true answer would be some vague (and probably unworked-out) combination of all three.
Fifteenth Century Cryptography
At the beginning of the 15th century, however, things started to shift (slightly) in the world of codemaking. 1401 was when a secretary at the Duchy of Mantua produced the following cipher alphabet for corresponding with Simeone de Crema:
Now, in many ways, this is a particularly stupid cipher alphabet, because the top (core) line maps each character in the alphabet to its reversed-alphabet equivalent (i.e. ABCDE –> ZYXUT and vice versa). Yet what is simultaneously clever about it is that it allocates multiple shapes to each of the five vowels.
To be honest, I think it would be a bit of a stretch to infer from this (as David Kahn tries to) that the notion of defending against frequency analysis-based attacks must necessarily have been entering cryptographers’ minds as early as 1401. Rather, it seems many times more likely to me that this trick (now known as “homophonic substitution”) was originally devised for a far more mundane reason: to make it harder for codebreakers to tell which letters are vowels and which are consonants.
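The two tricks combined — a reversed-alphabet core plus several shapes per vowel — can be sketched like this (using the modern 26-letter alphabet, with digits and symbols as invented stand-ins for the 1401 cipher glyphs):

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
# Core 1401-style mapping: each letter to its reversed-alphabet partner
ATBASH = {c: ALPHABET[25 - i] for i, c in enumerate(ALPHABET)}

# Extra homophones for the vowels only; the digit/symbol "shapes" here
# are invented stand-ins for the original cipher glyphs.
VOWEL_SHAPES = {
    "a": ["z", "1", "2", "3"],
    "e": ["v", "4", "5", "6"],
    "i": ["r", "7", "8", "9"],
    "o": ["l", "#", "$", "%"],
    "u": ["f", "&", "*", "+"],
}

def encipher(plaintext: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in plaintext.lower():
        if ch in VOWEL_SHAPES:
            out.append(rng.choice(VOWEL_SHAPES[ch]))  # any of four shapes
        elif ch in ATBASH:
            out.append(ATBASH[ch])  # consonants map one-to-one
    return "".join(out)
```

Consonant frequencies survive intact, but each vowel's count is split four ways — which is exactly what makes it harder for a codebreaker to tell which cipher shapes stand for vowels.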
Fast forward to the middle of the fifteenth century (probably circa 1450-1455), and we can still see the same palette of tricks in action in the following (undated) cipher alphabet in the Tranchedino cipher ledger from Milan:
Apart from abandoning the reversed alphabet as its base, it would seem that not much has changed since 1401: the vowels are still obfuscated with multiple homophonic alternatives (though with only three different shapes per vowel here, rather than the four shapes per vowel used half a century before).
The more observant among you will also notice that the (formerly Tironian) shorthand abbreviation ‘9’ gets its own cipher shape, as does ℞ (i.e. Rx, if your prehistoric browser can’t render Unicode character ‘U+211E’).
However, the later cipher alphabet also has special cipher shapes for doubled letters, a few other common shorthand abbreviations (p, etc), and a few more nulls than before:
The nomenclator is noticeably beefed up, with this particular cipher boasting more than eighty special entries:
Another Mantuan Cipher (1450)
Given that the 1401 cipher was from the Duchy of Mantua, it’s interesting to have a look at a Mantuan ducal cipher from 1450 in the Tranchedino ledger. This now has two homophonic shapes per consonant (except for x, z, and the ‘9’ shorthand shape), and three homophonic shapes per vowel:
It then has a mini-codebook of common words (Come, Quando, Quanto, Non, etc) and some nulls:
Interestingly, this is followed by an entirely new section, with arbitrary shapes standing in for a whole load of syllable groups (ab, ac, ad, af, ag, etc):
Finally, the page finishes up with roughly the same (small) size of nomenclator as had been in use in Mantua half a century previously:
So, You Call This “Progress”?
There is a long-standing (and widespread) tendency among writers on cryptography to present the development of ciphers in the fifteenth century as a kind of prototype of the modern arms race.
It’s perfectly true that, as the number of parties enciphering messages grew in the mid-15th century along with the first flush of modern diplomacy (many historians quite reasonably date this to the 1454 Treaty of Lodi), so too did the number of people who became experienced at cracking them.
However, there seems to me to be no evidence suggesting any kind of awareness of frequency analysis in the West in the fifteenth century. While Leon Battista Alberti’s short book on ciphers (“De Cifris”, 1466/1467) did cover this very well, he appears to have devised the abstract principles himself: and the contents of his book seem never to have been shared with anyone outside the Vatican. Similarly, al-Qalqashandi’s (1412) Arabic encyclopaedia entry on frequency analysis (mentioned in Kahn) appears never to have been transmitted to the West.
Don’t get me wrong, cryptology and cryptography both genuinely advanced in the sixteenth century: but in the fifteenth century, code-breaking had no mechanisms, no abstract methodology to work from: and fifteenth century code-making relied, by and large, on exactly the palette of tricks that were in place by 1450 or so. The only noticeable difference was that of scale: more homophones, more syllables, more nulls, and bigger nomenclators.
What, Then, Of The Voynich Manuscript?
In almost all practical senses, I think it’s fair to note that the Voynich Manuscript stands outside the cipher-making traditions you can see embodied in the cipher alphabets described above. It would seem to have too few cipher shapes to be using homophonic cipher tricks, doubled letters, a nomenclator of common words, or even nulls.
And yet it dates to this precise period: and – arguably the most telling cryptanalytical feature of all – there is still no modern-day consensus as to which shapes are vowels and which are consonants. Even now, the letters that resemble ‘a’, ‘e’ (sort of), ‘i’, and ‘o’ continue to convince people seeing the Voynich Manuscript with fresh eyes that they ‘must’ not only look like vowels, but ‘must’ also be vowels. However, the closer you look at these, the unlikelier and wobblier this conclusion gets.
So, here’s your paradox for the day: even though the Voynich Manuscript is almost certainly not using the homophonic trick (in use as early as 1401) of assigning multiple shapes to each vowel, it very much seems that its author devised or adapted an alternative way of concealing the plaintext’s vowels, i.e. of answering the same basic cryptographic ‘problematique’.
But how did it do that?