For a couple of weeks, I’ve been meaning to post about German Voynich blogger Elias Schwerdtfeger and what he calls the VMs’ “biological paradox”. His question is simple: why is it that the Voynich’s “biological” Quire 13 has both (a) complicated pictures of nymphs, tubes and baths, and (b) longwinded, redundant text? Surely, he asks, isn’t this combination somewhat paradoxical?

(To be honest, Elias’ post then goes off on a bit of a wild tangent: but given that it’s a good starting point and the whole issue of Q13 is a favourite of mine, I thought I’d step up to the line.)

Page f78r (one of the few that Leonell Strong was able to examine) has a number of good examples of this redundancy, in particular para 1 line 5’s “qokedy qokedy dal qokedy qokedy”, for which Strong’s 1945 worksheet #2 suggests the decryption “DUCTLE ROULLS THE GRAOTH COEMLI”.

This is the same piece of ciphertext about which Gordon Rugg asserted “This degree of repetition is not found in any known language” (Sci Am, 2004). Of course, linguist Jacques Guy ferociously responded to this Ruggish in sci.lang, firing off real-life counter-examples such as “di mana-mana ada barang-barang. Barang-barang itu…” As always, there’s a fair degree of truth in what both are saying: but the fact (as Elias points out) that only some parts of the Voynichese corpus read like “qokedy qokedy” is a pretty good indication that we can’t reduce this debate to an either-or between these two opposing poles. Essentially, it can’t be just a simple repetitive language if it’s not consistent throughout (and it isn’t): and beneath all the cryptographic window-dressing, there probably is some kind of meaningful language thing going on.
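
For anyone who wants to put a rough number on this kind of repetition, here is a minimal Python sketch. It is my own illustration, not anything Rugg or Guy actually ran: the function name and the rule of splitting on anything that isn't a letter are assumptions. It simply counts how often a word is immediately followed by an identical word, using the two snippets quoted above.

```python
import re

def adjacent_repeat_rate(text: str) -> float:
    """Fraction of adjacent word pairs in which both words are identical."""
    # Assumption: a "word" is any run of a-z letters, so hyphens and full stops split words.
    words = re.findall(r"[a-z]+", text.lower())
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)

voynich_line = "qokedy qokedy dal qokedy qokedy"
indonesian = "di mana-mana ada barang-barang. Barang-barang itu"

print(adjacent_repeat_rate(voynich_line))
print(adjacent_repeat_rate(indonesian))
```

Two hand-picked snippets prove nothing either way, of course: the interesting question is how such a rate varies from section to section across a whole transcription, and that depends on which transcription you trust.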

I’d say that Mark Perakh’s (1999) tentative conclusion on the language differences probably yields the most useful key to Elias’ paradoxical door. Mark wondered about the internal structural differences (i.e. within words) between Voynich Manuscript A and B language pages (and all the text that shades between A & B) and so carried out some tests: ultimately, his favoured explanation is that the A language is a more abbreviated & contracted version of the B language, but that beneath it all, they are still both expressions of the same thing. (Though Mark points to contraction probably being the main mechanism used).

So, the text in Q13 – as a B language object – therefore probably exhibits redundancy because it is more verbose. This suggests that we should be looking to decipher the B text, simply because we stand less chance of being distracted by the A text’s arbitrary contractions.

My own take is a little more nuanced (though still hypothetical, lest I raise the hackles of the hypothesis police once more). Firstly, I suspect that the A pages were written first, and that these were trying to duplicate an existing document using a verbose cipher – meaning that a ciphertext line wouldn’t map to the same physical space as a plaintext line. The only way to fit it in was to aggressively abbreviate & contract… but this helped make the ciphertext more opaque.
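
To make the space argument concrete, here is a toy Python sketch. The letter-to-glyph table is entirely made up for illustration (it is not a proposed Voynich key), and the sample plaintext, the function names and the crude abbreviation rule are likewise my own assumptions. All it shows is that a verbose cipher lengthens a line, and that contracting the plaintext first is one way to claw that space back.

```python
# Hypothetical verbose cipher table: each listed plaintext letter becomes one or two glyphs.
# These pairings are invented for illustration only.
VERBOSE_TABLE = {
    "a": "ol", "e": "qo", "i": "ai", "o": "or", "u": "ar",
    "r": "d", "s": "y", "t": "k", "c": "ch", "p": "al",
}

def encipher(plaintext: str) -> str:
    """Replace each letter via the table; characters not in the table pass through."""
    return "".join(VERBOSE_TABLE.get(ch, ch) for ch in plaintext.lower())

def contract(plaintext: str) -> str:
    """Crude abbreviation: keep each word's first letter, drop its later vowels."""
    words = plaintext.lower().split()
    return " ".join(
        w[0] + "".join(c for c in w[1:] if c not in "aeiou") for w in words
    )

line = "recipe aqua rosacea"             # made-up sample plaintext
full = encipher(line)                    # verbose cipher applied directly
short = encipher(contract(line))         # abbreviate first, then encipher

print(len(line), len(full), len(short))  # plaintext, enciphered, contracted-then-enciphered
```

The side-effect is the one described above: the contracted version throws away information before enciphering, which saves space but also makes the result harder to read back.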

Then, I suspect that the B pages were added, using smaller quills (say, eagle’s feather?) – because the smaller letter sizes took the pressure off the overall line lengths, the need for contraction and abbreviation was reduced. However, I think some aspects of the coding system changed (specifically the steganographic numbering scheme, but that’s another story!), making the B pages harder to break in a different way.

That is, I suspect that we have two types of ciphertext present in the VMs: a simpler cipher system A (but with a significant amount of contraction and abbreviation) and a more complex cipher system B (but with less contraction and abbreviation to distract us). And just to make things really difficult, there are probably system B pages that are also heavily contracted (i.e. the worst of both worlds).

And some people still wonder why computers can’t break the VMs! *sigh*

  1. Vytautas on April 1, 2009 at 12:01 pm said:

    Hi, Nick
    what about RNA and DNA systems (as per Steve Ekwall: “Plants are RNA, humans are DNA…” :) Something similar…


  2. Could be true… I think the last word has yet to be said on Steve Ekwall… 🙂

  3. Vytautas on April 1, 2009 at 12:43 pm said:

    And yet two thoughts by the way :
    1) Voynich manuscript may be book about cryptography written enciphered and all illustrations concerns enciphering methods used by author or by his coevals… Yet one hypothesis.
    2) Next thought a bit not correct, but… There is possibility Steve Ekwall is not real person… Aka pseudonym of someone of list members wanted to say different opinion and test it 🙂

  4. 1) This has been suggested a number of times, though it doesn’t really seem to cover why there are both “herbal” pages (without much text) and “recipe” pages (full of text) – unless all those drawings actually mean something (in some way), it’s a fairly dramatic waste of space. 🙂
    2) Having spoken with Steve Ekwall, I can assure you that he’s a real live person (and very friendly, too), rather than some invention of a list troll. The more you know about Steve, the harder it is to explain what he thinks happened to him. 😮

  5. Emily on April 1, 2009 at 6:11 pm said:

    By the way, which language is the one Jacques Guy cites as full of repetition? (Are the reduplicated words plurals?)

    I think that if you designed a computer that could decipher the Voynich– using both cryptanalysis and historical/cultural/scientific knowledge– that computer would have to be of near-human intelligence and versatility (a universal Turing machine, if I have my jargon right).

  6. You’d have to ask Jacques…

    And you’re right: it’s a human problem, not a computer problem. For the moment! 🙂

  7. infinitii on April 1, 2009 at 6:44 pm said:

    That example from above is Indonesian, but I think Jacques also knows that Chinese is full of repetition as well — as for what those reduplicated words mean, no idea.

  8. [Note: GC submitted this comment via email, please excuse me if it ends up with the wrong avatar!]

    No problem here, except that the actual quote is “sutli ductle roulls the graoth coemli”. As odd as that may seem, it is derived mathematically, not through any linguistic means, and has a linguistic basis. A mathematical approach to the text is the only appropriate means of attack.

    It’s easy to speak of contracted or abbreviated text, but that doesn’t match up with other observations. Word length statistics in Voynich-101 are in line with a language that falls between Latin and English, so abbreviation of say, one glyph per word would expand these statistics beyond known European languages. The EVA alphabet already does this expansion, depending on how carefully you choose to view the related glyphs, and EVA transcriptions have no match to any language, surviving or otherwise. The herbal-A texts have a well-established base of 17 primary glyphs, this is slightly expanded in the B texts. The additional part of what we would consider a correct linguistic distribution are absorbed by the overuse of glyphs such as “o” in both the A and B texts. Linguistically speaking, in the A texts, three specific floating glyphs take up the missing portion of what would otherwise be considered a normal alphabet distribution. In the B texts this is expanded to include the “c”, which can be considered to be floating “vowels”, their definition most usually falling within the vowel range. Consider the most common “word” in both cases, the word “8am”, how can this be verbose? How can the multiple instances of “8”, “9”, and other glyphs as stand-alones be considered verbose? There exists a very large argument against this type of thinking.

    Yes, I could pick your whole argument apart, but that wouldn’t do any good because you reach conclusions that have no basis in fact. The bottom line here is that you know nothing about the cipher construction, and because you don’t, you are willing to hypothesize based only on the erroneous hypotheses of others. Yes, you’ve once again been busted by the hypothesis police.

    Now let’s look at reality. The differences in script translate to a line length that differs between 37 glyphs to 50 glyphs, it’s actually an average difference of 9 glyphs between the most compact and the most verbose of scripts. The longer script examples make use of narrower margins, and the average height and width of the individual glyph varies only slightly, so the perception that some pages are more compact than others is partially illusion. It could be something as simple as the author only having so much to say in a space, so he fills that space a bit larger than the other, extending the margins, nothing more – just an hypothesis.

    The “eagle feather” gives me a laugh. The travelling Jesuits were known for producing and selling small pocket bibles made of fake uterine vellum, and these were written using stiles, not quills. Such small print using a split stile, print much smaller than that used in the VMS. There are examples of these Jesuit bibles out there in the ether, very good examples actually. The question the “eagle feather” assertion brings up is one of available technology, and the Jesuit bible example demonstrates that the technology to write minutely was available though not used in the VMS, since the average height of VMS text approximates our modern 12 point type. Do you see why the “Hypothesis Police” exist? I haven’t even addressed your cipher hypothesis presented here, I’ve spent all my time addressing the underlying or “hidden” hypotheses that apparently direct you toward the fallacious conclusions you’ve reached. My usual take would be to ignore this post as that of an uninformed crackpot that doesn’t know his ass from a hole in the ground. To me an hypothesis must have some basis in supportable fact, not just hearsay from some other researcher who may be misinformed.

    So this pisses you off? Well that’s a good start. Tell me what makes you angry, and I’ll tell you why I think that way, in living color if need be. Let’s throw all that other shit out the window and get to the facts, that’s where it’s all going to happen anyway. That post you just made could have included a good deal of usable information that said something others could use as guidance. At one point or another we both need to be working off the same fact sheet, and we’re clearly not yet there.

  9. In Strong’s worksheet #2, “qokedy qokedy dal qokedy qokedy” is transcribed as “qotedy qokedy dal qotedy qokedy” (which is incorrect). And even in these five words, there are cipher inconsistencies:-
    qotedy qokedy dal qotedy qokedy
    797531 474135 797 531474 135797
    In alphabet #3, letter ‘o’ is deciphered as both ‘R’ and ‘O’: in alphabet #7, both ‘o’ and ‘d’ decipher to ‘O’. Please don’t read this as saying you’re right or you’re wrong about backing Strong’s cryptographic horse – as Jim Gillogly pointed out to Strong himself 30 years ago, it is hard to reconcile the kind of frequency-flattening cryptography Strong wrote about with strongly-structured sections (of which “qokedy qokedy dal qokedy qokedy” is perhaps one of the most strongly structured).

    As far as contraction & abbreviation goes: I suspect that people who don’t subscribe to Strong’s decryption would do well to read Mark Perakh’s paper – whether words are longer or shorter than English or Latin all comes down to what the correct level of transcription / tokenization should be, and we have substantially different views on that… but that’s OK. As for “dam”, my guess is that it is formed from two tokens (d + am) and that it codes for “&x”, i.e. ‘etc’. As for ‘8’ [d] and ‘9’ [y], they are probably not verbose: but I never said every letter had to be, did I? 🙂 And the Society of Jesus formed in 1534, so your travelling Jesuits and their split stiles might well need to have been time-travelling Jesuits too, depending on when the VMs was written. 🙂

    However, trading monkey faeces is all a bit pointless: unless you have swallowed a month’s worth of be-really-dogmatic pills in one hit, it ought to be pretty clear that there exists very large arguments against all types of thinking, VMs-wise – we’re none of us immune to criticism. The question is how to move forward – what that “fact sheet” would look like.

  10. Marke Fincher on April 4, 2009 at 1:58 pm said:

    Hi Nick,
    I’m fairly surprised that you posted GCs email on your blog given how rude parts of it are. I fail to see really what there is ever to gain by being abusive and insulting and I wonder if people resort to such behaviour when they cannot really rely on the strengths of their arguments alone.

    It’s the sort of “uncivilized” (and pointless) behaviour that unchecked hastened the demise of the Voynich Mailing List and I hope you can protect your blog from similar pollution.

    But leaving that aside; my take on this is:

    Several results of my analysis into the distribution and density of variation lead me to believe that the degree of contraction and abbreviation present in the underlying plaintext of VMS-A and B is not significantly different.

    If a much higher degree of contraction and abbreviation were used in B versus A then you would expect to see less redundancy and a denser, more compact distribution of variation.

    I do agree with you that the ‘A’ pages came first BTW.


  11. Hi Marke,

    Glen is particularly passionate in his defence of Leonell Strong’s claimed decryption against any perceived criticism of it – and if sometimes he goes over the line, I don’t mind too much as long as he backs up his arguments (something notably absent from the mailing list).

    Contrary to what a statistics professor might tell you, entropy, redundancy, and structure are all extraordinarily subjective terms because they rely on your having got the transcription and tokenization right in the first place. What is perhaps lacking most in the “language” debate is someone to step forward and map out precisely how B text differs from A text: this is because things like the presence of free-standing ‘l’ (a strong B feature, I’m pretty sure) alter all the stats – but is that ‘l’ really the same letter as the one in ‘al’ and ‘ol’?

    The model underlying your stats probably presumes that they are all the same letter, but I’m sure you can see how much would be altered if that presumption (and the 20-30 similar ones relating to tokenization) were wrong.

    Cheers, ….Nick Pelling….

  13. Marke Fincher on April 9, 2009 at 7:05 pm said:

    Your point about information theory and how any analysis from it only relates to the specific transcription (or act of extraction) of the information, which of course can be inaccurate, is an important point for sure Nick, and it should always be in our minds somewhere when interpreting any result, but I think it is possible to overstate the magnitude of this practical problem, as if it has the power to render all results meaningless, make all analysis futile and change black to white and make a pencil look like a hippopotamus! But it’s not like that.

    If there are distinctions made in the transcription which were not intended then what you are analysing includes a “noise” which dilutes the signal, but such a random noise will flatten and weaken relationships and not produce artificial ones of any strength. So strong relationships that you do find and study can not be complete “mirages” produced by an inaccurate transcription.

    Distinctions that were intended but missed by the transcription are more tricky perhaps, but mostly just means that you are looking at the thing through a fuzzy lens and missing some of the detail as a result….but what you can learn from the detail that you can resolve is not meaningless and is still very useful. And most uncompressed information is surprisingly error tolerant.

    So we should be careful in stating the problems of transcription to not pull the Rugg from under the feet of people who are willing to put effort into analysis, which is pretty hard work for the most part but generally very worth it and almost certainly the only thing that will lead to a decryption (although perhaps not on its own).


  14. Of course we have a problem. We simply do not know enough about the writing system used in the VMs, and we know nothing about the underlying language. (Not even that there is an underlying language.)

    In every transcription system is a whole set of assumptions about the nature of the writing system, and when I do some analysis in the transcriptions, I try to use the transcription alphabet of which I think it fits best in the matter of my analysis. For things like word-length distributions I feel (yes, it isn’t scientific) EVA as inappropriate and transform the transcription mechanically to Currier instead, which gives a much better match between ASCII characters and the things in the script I believe (it isn’t scientific again) to be a “letter”.

    I am a little familiar with the great Voynich 101 transcription and its underlying assumptions. Some of them are worth to be considered, especially the different flourishes in the EVA “sh”-glyph, and some other seems (insert my default bracketed phrase here) a little bit too finetuned to me.

    But to see the thing I called the “biological paradox”, one don’t need any transcription at all, only a hires or midres image from the Beinecke website. It is obvious, even to someone who see the page for the first time. (Even a friend of mine which know nothing about the VMs is able to see it.) There are lots of similar formed “words” on the pages in the biological section, much more than in any other section of the Ms. The degree of similarity may differ with the transcription system used, but the fact will remain: The section of the VMs which contains the most disturbing and unique kind of illustrations seems to contain of the most redundant kind of “text”. This is really weird.

    It is not the kind of “hard stuff” I wanted to post to the mailing list, but it had inspired me to do some (explicitly called as such) speculations about a “psychological feedback process” similar to parapsychological (this is a good word to remove the last similarity to things one can call scientific without becoming ridiculous) “automatic writing”, which I posted in my german blog. The speculative stuff is unripe and may not worth further research at the moment. But the “biological paradox”, as I named it, remains…

    And — of course — please excuse my bad english. But I hope it is easier to understand than a google translation of my — sometimes rather “cryptic” — german. 😉

  15. Hi Elias,

    I liked your “biological paradox” idea very much (qokedy qokedy dal qokedy qokedy, indeed!), and it seems to have inspired plenty of commentary here, which is good – all food for thought. 😉

    I think I should post soon about how transcription and modelling strategies affect the statistics so massively, it’s a big topic that nobody seems to cover properly…

    Though my grasp of German is patchy (for my sins in a past life, I chose to learn Ancient Greek instead at school), Google Translation is normally good enough to get me 50% of the way there (though I do have to work to get the other 50%). But surely you have much better things to do than pore over your server logs to see when I’m dropping by? 🙂

    Cheers, ….Nick Pelling….

  16. […] Google Translation is normally good enough to get me 50% […]

    Sounds similar to me reading dutch. I understand 50%, but I do not understand 100% after reading the text twice… 😉

  17. Nick, the old links to Mark Perakh’s paper no longer work.

    I’ve found another link but your spam filter allows none in a post so I’ve tried putting in the ‘website’ box.

  18. In case Elias should ever see this – If I’d known about your mention of those travelling Jesuits’ bibles, I’d have mentioned it in a recent post of mine entitled ‘Jewish influence’.

    Sorry I can’t include a link or address – Nick’s spam filter has forbidden it.

    The problem is that McCrone said all the text was written with a quill, and the tiny, tiny writing is too small to be sure.

  19. The Society of Jesus (Jesuits) was formed in 1534 and approved by Pope Paul III in 1540. It is about 100 years too late for God’s Marines to have used the MS 408.

  20. Xplor
    I’m not interested in Europe’s sectarian religious issues, but as a point of fact I’ve never seen the term ‘marines’ applied to them in any historical document dated between the sixteenth and eighteenth centuries. If you have, I’d be glad of the reference.

    Glen [not Elias] said (via Nick)
    “small pocket bibles … written using stiles, not quills. Such small print using a split stile, print much smaller than that used in the VMS.”

    When he says ‘print’ I think he means ‘script’ (?) because he also says

    … the Jesuit bible example demonstrates that the technology to write minutely was available…

    Glen hadn’t seen the minute print in fol.9v,

    What interests me is the ‘hand’ in which these bibles were written, and when the first examples occur.

    As many have pointed out, the pigments and inks have not been dated, and some of the pigments, at least, seem to have been added late: so might the smallest line of script.

    McCrone said the whole thing was written with a quill, but Nick has long suspected that a stylus might have been employed at some stage of production, and I doubt McCrone tested that small string.

    So I’d like to see some examples.

  21. Xplor, this complies with my theory, that the Jesuits got the VMS in hand in 1570, when they arrested Gerolimo Cardano at the University of Bologna because of blasphemy. Cardano was 1560-1570 professor in medicine and mathematics at the University of Bologna. He also specialized in cryptography.

  22. Xplor, Menno
    is there really any need to invent stories depicting the manuscript as a bone over which imagined villains and ‘good guys’ battle?

    The manuscript appears first with the signature of a man raised by the Jesuits, passes then to three persons in sequence who were educated by Jesuit scholars and is finally given as gift to another Jesuit.

    Since all non-perishables given as gifts to an individual Jesuit became not his, but the community’s, why this common idea that Jesuit ownership must be the result of theft or skullduggery? They had libraries filled with books donated or bequeathed to them. Just because we can’t read it there’s no need for all this Brown-ish novelising, is there?

    If there were any theft involved, for all we know it occurred long before, and legally ~ given the appalling laws in effect throughout medieval and later Europe.

  23. qokedy qokedy dal qokedy qokedy

    a league, a league; a league [and] half a league..

    – sorry, Alfred

  24. Tricia: perhaps that’s why people take so much poetic licence with Voynichese?

  25. Had the Jesuits found the author of MS-408 we would not know of the book today. It did not make the Index of Forbidden Books because they could not read it. Lucky for us the Jesuits were started a hundred years later.

  26. Diane, If you know how the VMS came into the hands of the Jesuits, why don’t you tell us ?

  27. Menno,

    it’s no secret that Jakub Hořčický (Lat: Jacobus Sinapius) later ennobled as … of Tepenec .. was brought up by the Jesuits and that what some people try to represent as child slave labour in the kitchen was nothing more than a child’s regular household chores. And he really enjoyed working in the garden and cooking up medicinal mixtures all his life – so obviously not too many emotional scars from helping around the house.

    For reasons which are beyond my understanding, various wiki articles find it difficult to accept religious pluralism even in the 1600s, and write foolish things such as Jakub wrote a ‘pro-Catholic’ pamphlet. He, like so many others, wrote in defence of the Christian sect he believed more theologically correct – and he was a Catholic, so what’s the fuss?

    Such wiki articles often give the impression that their own bias towards one or other of the various Christian sects is the ‘only’ good one – but such petty bickering is bewildering to outsiders: who cares?

    Fact is – despite the omissions which allow readers to believe that Jakub lived with his parents and was forced to be a child-labourer, in fact Jakub like many others was a child ‘cast upon the parish’ and most of the charitable work, then and now, continued to be left to the religious orders, as always.
    He was cared for; his education was paid for; they found him a good and well-respected pharmacist to whom he could be apprenticed, provided him with a ‘lab’ and a garden and cared for him while he was dying. He left them most of his property, but perhaps not the Vms – not for deep and sinister reasons, but because it was the sort of book you give to someone with the same intellectual interests.

    Down the line, Marci gives it as a gift to Kircher.

    Once it reached Kircher it became not personal property but part of the community’s property held in common.

    When Kircher died, it would by default have gone to a school or library owned by the order, and as far as we know it did – Voynich’s widow said so.

    The story may be a bit more complex, but those were complicated times.

  28. Diane: I really don’t think that the way you seek to downplay the gulf separating Catholics and Protestants at that time is historically supportable. Kingdoms were split apart over it, wars were fought over it, it was a Very Big Deal Indeed.

    Furthermore, we have (what I think is) excellent evidence that the Voynich Manuscript passed by some means from the Catholic side of this divide to the other and then back again. Your trying to retrospectively impose some kind of Catholic continuity of ownership runs directly counter to this basic provenance evidence.

    Sinapius didn’t stay merely a Jesuit child, but became both the Imperial Distiller and hugely rich from sales of his distillations. As cheeses of the time go, he was a big one.

  29. Nick, you say:-

    “Furthermore, we have (what I think is) excellent evidence that the Voynich Manuscript passed by some means from the Catholic side of this divide to the other and then back again.”

    If it’s permitted, could you be more precise? Are you allowed to fill in the spaces below?

    …. excellent evidence that the Voynich Manuscript passed by some means from … [Person/Place A]…. to …[Person/Place B]…. and then back again to … [Person/Place C].

    Europe’s half-millennium-old religious propaganda does not interest me. So much does it not interest me that as soon as someone says all members of sect-A are ‘good, rational, reasonable and devoted to aristocratic rule’, I’ll hop right in there and say exactly the same about people in sect-B.

    And you know, don’t you, that odds are I’ll be right – mostly.

  30. Diane: Sinapius -> Baresch -> Marci.

    As an historian, what I find interesting is religious propaganda that incites wars rather than just pamphlets. This is definitely the former rather than the latter.

  31. As an historian and archaeologist what I find troublesome – and not at all rare these days – is when nationalist and religious sentiment turns reporting evidence into an exercise in ‘urging’ the reader to adhere to a preferred ‘line’.

    But Nick – while you’re about. Nice picture of a Venetian watermark. I do hope no-one tries to use this to argue the Vms’ ‘September’ an homage to some 15thC figure.
    Venetian paper

  32. Diane, It has been stated that Jacub Horczky (Sinapius) has been the owner of the VMS, whether he received it from emperor Rudolf II (general view) or through internal Jesuit lines (my view) has not been certified yet.

    If the ms bought by Rudolf II from John Dee indeed was the VMS (general view) or an other ms from Roger Bacon is just a rumour (hearsay). There is no evidence, but the Roger Bacon-John Dee-Rudolf II connection is regarded as unlikely. This leaves the possibility that Horczky got the VMS through Jesuit lines. My question to you was, how the Jesuits got the VMS in hands, but that question you did not answer. That is the point, where I have brought up Gerolimo Cardano. It could well be, that he was the one responsible for the second binding of the VMS. It’s just a theory for now.

  33. As far as I know, though Nick says there’s evidence otherwise, it became property of the Order when it was given to Kircher.

  34. Diane: my position, for what it’s worth, is that Sinapius (Catholic) owned it, then somehow Baresch (a Bohemian Protestant) owned it, then Marci (a wavering Catholic) owned it, and then *possibly* Kircher owned it. But it ended up being owned by Jesuits for sure. 🙂

  35. Diane on April 30, 2015 at 3:29 am said:

    Marci a “wavering Catholic” – really? From the evidence I’ve seen so far, I can’t adopt such an opinion.

    And given that he remained close friends with Kircher and with Baresch and again with Kinner, I don’t see that the matter of which formal Christian sect a person followed could have been *such* a big deal in seventeenth century Prague. Being a Hussite or Adamite or so would be unacceptable to the more conservative, of course.

    But really I wanted to emphasise one sentence from your post:

    “abbreviation of say, one glyph per word would expand these statistics beyond known European languages..”

    So true.


