I was deeply saddened this week to find out that Mark Perakh died last year, on 7th May 2013 in Escondido, California. He wrote with such vitality that I never even stopped to consider his age: but he was in fact 88.


Perakh’s was a life of three professorial acts: first in Russia, then in Israel, and then finally in America. It seems that Perakh was goaded most frequently into action by a drive to resist that which he considered false knowledge – for him, dissenting sincerely meant fighting.

In recent decades, the things that goaded him to greatest action were the grand pseudoscience and pseudohistory constructions of fundamentalist Christian literalism: specifically, the Bible Codes (don’t get me started on that, or I’ll be typing all night) and literal Creationism. His book “Unintelligent Design” surely forms as good a sustained counterargument as needs to be written to the pro-creationist arguments of William Dembski et al.

Back in the world of cipher mysteries, for a short while Perakh brought his mathematical and statistical heavy guns to bear on the Voynich Manuscript’s confounding ‘Voynichese’ text: and his exemplary 1999 paper “APPLICATION OF THE LETTER SERIAL CORRELATION TEST TO THE VOYNICH MANUSCRIPT” is something I often suggest that researchers take a look at.

Unfortunately, since 2011 all the copies of it outside the Wayback Machine seem to have withered on the virtual vine: so I thought I’d take this opportunity to praise the man and resurrect his paper here on Cipher Mysteries, for anyone with an interest in statistical studies of the Voynich Manuscript.

So, here’s part 1 (his experimental tests and raw data) and part 2 (his conclusions): highly recommended stuff!

Incidentally, until just now I’d forgotten that Mark Perakh also ran his LSC (Letter Serial Correlation) tests on Gordon Rugg’s generated Voynichese-like text: and that it produced results that were close to those returned by the artificial gibberish text mentioned in Perakh’s paper, and quite unlike those yielded by Voynich A or B texts (which are very close to those characteristic of proper languages). In an online comment from 2004, Perakh expressed disappointment that Rugg had felt the need to gild his experimental lily for publication in Scientific American.
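For anyone who hasn't met this kind of test before, here is a rough sketch of the general idea behind a letter serial correlation statistic: cut the text into equal-sized chunks, count each letter in each chunk, and add up the squared differences in those counts between neighbouring chunks. This is a simplified illustration only, not a reproduction of Perakh's exact procedure: the function name `lsc_sum`, the decision to ignore whitespace, and the dropping of any leftover partial chunk are all assumptions made for the example. His paper gives the actual definitions and the comparison against what a randomised text would be expected to yield.

```python
from collections import Counter

def lsc_sum(text, chunk_size):
    """Sum, over all letters, of squared count differences between adjacent chunks.

    Assumptions (for illustration only): whitespace is ignored, and any
    trailing letters that do not fill a whole chunk are dropped.
    """
    letters = [c for c in text if not c.isspace()]
    n_chunks = len(letters) // chunk_size
    used = letters[: n_chunks * chunk_size]

    # One letter-frequency table per chunk.
    counts = [
        Counter(used[i * chunk_size : (i + 1) * chunk_size])
        for i in range(n_chunks)
    ]
    alphabet = set(used)

    total = 0
    for left, right in zip(counts, counts[1:]):
        for ch in alphabet:
            # Counter returns 0 for letters absent from a chunk.
            total += (left[ch] - right[ch]) ** 2
    return total

print(lsc_sum("aabb", 2))  # chunks "aa" and "bb" differ completely
print(lsc_sum("abab", 2))  # chunks "ab" and "ab" are identical
```

The interesting part is not any single number but how such a sum changes as `chunk_size` is varied, and how a real text compares with a scrambled copy of itself.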

For a couple of weeks, I’ve been meaning to post about German Voynich blogger Elias Schwerdtfeger and what he calls the VMs’ “biological paradox”. His question is simple: why is it that the Voynich’s “biological” Quire 13 has both (a) complicated pictures of nymphs, tubes and baths, and (b) longwinded, redundant text? Surely, he asks, isn’t this combination somewhat paradoxical?

(To be honest, Elias’ post then goes off on a bit of a wild tangent: but given that it’s a good starting point and the whole issue of Q13 is a favourite of mine, I thought I’d step up to the line.)

Page f78r (one of the few that Leonell Strong was able to examine) has a number of good examples of this redundancy, in particular para 1 line 5’s “qokedy qokedy dal qokedy qokedy“, for which Strong’s 1945 worksheet #2 suggests the decryption “DUCTLE ROULLS THE GRAOTH COEMLI”.

This is the same piece of ciphertext about which Gordon Rugg asserted “This degree of repetition is not found in any known language” (Sci Am, 2004). Of course, linguist Jacques Guy ferociously responded to this Ruggish in sci.lang, firing off real-life counter-examples such as “di mana-mana ada barang-barang. Barang-barang itu…” As always, there’s a fair degree of truth in what both are saying: but the fact (as Elias points out) that only some parts of the Voynichese corpus read like “qokedy qokedy” is a pretty good indication that we can’t reduce this debate to an either-or between these two opposing poles. Essentially, it can’t be just a simple repetitive language if it’s not consistent throughout (and it isn’t): and beneath all the cryptographic window-dressing, there probably is some kind of meaningful language thing going on.
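One simple way to put a number on "reads like qokedy qokedy" is to count how often a word is immediately followed by an identical word. The sketch below is only a minimal illustration: the function name `adjacent_repeat_rate` is made up for the example, and it assumes you already have a transcription split into words by spaces (which, for Voynichese, is itself a judgement call).

```python
def adjacent_repeat_rate(words):
    """Fraction of adjacent word pairs in which both words are identical."""
    if len(words) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return repeats / (len(words) - 1)

# The f78r line quoted above: 2 of its 4 adjacent pairs are exact repeats.
line = "qokedy qokedy dal qokedy qokedy".split()
print(adjacent_repeat_rate(line))  # 0.5
```

Running something like this section by section (rather than over the whole corpus at once) is what would show whether the repetition is uniform or concentrated in particular quires. Note that tokenisation matters a lot here: Guy's "mana-mana" counts as one word if you split on spaces, but as a repeat if you split on hyphens too.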

I’d say that Mark Perakh’s (1999) tentative conclusion on the language differences probably yields the most useful key to Elias’ paradoxical door. Mark wondered about the internal structural differences (i.e. within words) between Voynich Manuscript A and B language pages (and all the text that shades between A & B) and so carried out some tests: ultimately, his favoured explanation is that the A language is a more abbreviated & contracted version of the B language, but that beneath it all, they are still both expressions of the same thing (though Mark points to contraction as probably the main mechanism used).

So, the text in Q13 – as a B language object – therefore exhibits redundancy probably because it is more verbose. This suggests that we should be looking to decipher the B text, simply because we stand less chance of being distracted by the A text’s arbitrary contractions.

My own take is a little more nuanced (though still hypothetical, lest I raise the hackles of the hypothesis police once more). Firstly, I suspect that the A pages were written first, and that these were trying to duplicate an existing document using a verbose cipher – meaning that a ciphertext line wouldn’t map to the same physical space as a plaintext line. The only way to fit it in was to aggressively abbreviate & contract… but this helped make the ciphertext more opaque.
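To make the line-length arithmetic concrete, here is a toy sketch. Everything in it is hypothetical: `TOY_TABLE` is not a proposed Voynich key (it just turns every letter into two glyphs), `contract` is a crude stand-in for abbreviation (keep the first letter, drop later vowels), and the sample Latin-ish line is invented for the example.

```python
import string

# Toy verbose cipher: every plaintext letter becomes TWO ciphertext glyphs.
# Purely illustrative; not a claim about how Voynichese actually works.
TOY_TABLE = {ch: ch + "*" for ch in string.ascii_lowercase}

def encipher(plaintext):
    """Replace each letter by its two-glyph code; leave spaces untouched."""
    return "".join(TOY_TABLE.get(ch, ch) for ch in plaintext.lower())

def contract(word):
    """Crude abbreviation: keep the first letter, drop any later vowels."""
    return word[:1] + "".join(ch for ch in word[1:] if ch not in "aeiou")

line = "radix herba folia"
full = encipher(line)
short = encipher(" ".join(contract(w) for w in line.split()))

# Plaintext length, verbose ciphertext length, contracted ciphertext length.
print(len(line), len(full), len(short))  # 17 32 18
```

So a 17-character plaintext line balloons to 32 characters under the toy verbose cipher, but comes back down to 18 once the words are contracted first: roughly the same space as the original line, at the price of throwing information away.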

Then, I suspect that the B pages were added, using smaller quills (say, eagle’s feather?) – because the smaller letter sizes took the pressure off the overall line lengths, the need for contraction and abbreviation was reduced. However, I think some aspects of the coding system changed (specifically the steganographic numbering scheme, but that’s another story!), making the B pages harder to break in a different way.

That is, I suspect that we have two types of ciphertext present in the VMs: a simpler cipher system A (but with a significant amount of contraction and abbreviation) and a more complex cipher system B (but with less contraction and abbreviation to distract us). And just to make things really difficult, there are probably system B pages that are also heavily contracted (i.e. the worst of both worlds).

And some people still wonder why computers can’t break the VMs! *sigh*