In case you’ve arrived late to the linguistics party, abjad is a term used to describe a writing style for a language (primarily) made up from consonants, where the reader is required to fill in the unwritten vowelled gaps for himself/herself. Perhaps the best-known example of this is the modern Arabic script, from the first four letters of whose alphabet the term “abjad” comes – in fact, it’s the Arabic word for “alphabet”.
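To get a feel for what an abjad asks of its reader, here is a crude Python sketch that writes English "abjad style" by dropping the five vowel letters. The name `strip_vowels` is made up for illustration, and real abjads such as Arabic do mark some (long) vowels, so treat this as a rough approximation only.

```python
VOWELS = set("aeiouAEIOU")

def strip_vowels(text):
    """Drop the vowel letters from each word; words left empty are omitted."""
    stripped = ("".join(ch for ch in word if ch not in VOWELS)
                for word in text.split())
    return " ".join(word for word in stripped if word)

print(strip_vowels("You can write English in an abjad way if you want to"))
```

Reading the output back is the interesting half of the exercise: the reader has to supply every missing vowel from context.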

So… might Voynichese be written in an abjad writing style?

Freelance systems analyst Joachim Dathe thinks so: inspired initially by the apparent similarity between the Voynich Manuscript’s (occasionally ornate) script and Arabic calligraphy, for the last few years he has been promoting and refining his theory that Voynichese is nothing more than Arabic written in an apparently unique (and rather idiosyncratic) abjad stylee.

Yet at the same time, Dathe also believes that the Voynich’s Arabic plaintext can only be extracted with difficulty, because in his particular Arabic reading of it:-
* Punctuation is absent
* Sentence structure isn’t at all obvious
* Word boundaries are often inexact or missing
* Spaces are often inserted inside words
* “Words often appear […arranged or ordered…] in a way which is not compliant with the Arabic language”

His overall conclusion: “Obviously, the texts were dictated to a writer who did not master Arabic scripts.”

For example, Dathe and his translator collaborator admit that their transliteration of the start of f1r yields a fairly jumbled (if not actually random) set of Arabic words, and offer the following interpretative translation of it (though naturally only one of many possible):-

A dervish continues to Elate, believing that he is forgotten, and when I am surrounded by his presence, I am in Eden. I am a naught in his life. When despaired of Iman Taha (the faith of The Prophet Peace be upon him), he was purified by an illusion, this is what my faith has inspired me yesterday. I see it distantly in the image of my mother. Do we blame he who offered his life? If you deny him you pierce my eyes, and if you embrace him your excuse will be realized.

Now, claiming a Voynichese abjad decryption that proves unrelated to the drawings and imagery (in Dathe’s case, of “religious content from Sufism”) isn’t unique: John Stojko’s (in)famous vowel-free proto-Ukrainian Voynich decryption of f18r – “What slanted Oko is doing now? Perhaps Ora’s people you are snatching. I was, I am fighting and told the truth. Oko you are fighting mischievously (evil manner). Ask this. Are you asking religion for your clan?” – springs to mind.

Of course, this comparison is hardly breaking news: Elmar Vogt noted much the same similarity in 2012, though going on to compare both sets of mangled-sounding plaintexts with Vogon poetry was perhaps a teensy bit harsh. Still, I do find it hard to disagree with Elmar’s sentiment that Dathe’s “approach is flagrantly naïve”: if there is a real, tangible difference between the way Stojko and Dathe both approached Voynichese, I certainly can’t see it. And if one is wrong for that reason, then so surely is the other.

(Remember: the long-established template for bad Voynich theories is (a) to conjure up a simple-sounding explanation, and then (b) to wrap that up in a long series of what are known as “saving hypotheses” – additional weasel-like meta-explanations that serve to explain away conflicts between that wonky core explanation and an inevitably long succession of inconvenient historical truths. Voynich theorists like to think of themselves as following in the giant decrypting footsteps of Young, Champollion, Ventris et al: but none of that august list put forward theories that needed extensive sets of saving hypotheses to explain away contingent problems.)

In many ways, though, simply grabbing hold of a given abjad script (whether Arabic or vowel-less proto-Ukrainian, if such a thing ever genuinely existed) as a starting point for decrypting the Voynich is without much doubt a poor way to proceed. The proper first question is instead this: what is the linguistic evidence that Voynichese is a script that has no vowels?

Linguists have long exercised their cunning (if you’ll excuse the reordered juxtaposition) by running text corpora through consonant-vowel analysis programmes: basically, they’re looking for hidden Markov models (HMM) with a small number of vowels that constantly recur without leaving consonants adrift in blocks (known as CVCV structure).

Reddy and Knight reported:-

[Jacques] Guy (1991) applies the vowel-consonant separation algorithm of (Sukhotin, 1962) on two pages of the Biological section, and finds that four characters (O, A, C, G) separate out as vowels. However, the separation is not very strong, and several words do not contain these characters.
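For anyone who hasn't met it, Sukhotin's algorithm is simple enough to sketch in a few lines. What follows is a minimal Python rendering of the usual description of the method (not Guy's code, nor Reddy and Knight's). It assumes that every symbol starts out as a consonant, that only adjacencies between different symbols inside a word are counted, and that ties are broken alphabetically. The function name and the toy word list are illustrative assumptions.

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Return the set of symbols that Sukhotin's procedure classes as vowels."""
    # Symmetric counts of how often two different symbols sit side by side.
    adj = defaultdict(lambda: defaultdict(int))
    symbols = set()
    for word in words:
        symbols.update(word)
        for x, y in zip(word, word[1:]):
            if x != y:
                adj[x][y] += 1
                adj[y][x] += 1

    # Each symbol's score starts as its total number of adjacencies.
    sums = {s: sum(adj[s].values()) for s in symbols}
    vowels = set()
    while True:
        consonants = [s for s in symbols if s not in vowels]
        if not consonants:
            break
        best = max(consonants, key=lambda s: (sums[s], s))
        if sums[best] <= 0:
            break  # nothing left that behaves like a vowel
        vowels.add(best)
        # Symbols adjacent to the new vowel look more consonant-like.
        for s in consonants:
            if s != best:
                sums[s] -= 2 * adj[s][best]
    return vowels

print(sorted(sukhotin_vowels(["banana", "bonobo", "tomato", "potato"])))
```

Because the function only iterates over each word, it should work equally well on lists of glyph tokens as on plain strings, which matters for the transcription question discussed further down.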

At the same time, when they ran their own 2-state bigram HMM programme on Voynichese, the only feature they noted was the strong binding between the final letter of words (typically EVA ‘y’) and the space following it, a pattern they thought similar to Arabic script. So… it is Arabic, then?
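The "binding" between a letter and the following space can be seen with something far simpler than an HMM. The Python sketch below is not Reddy and Knight's method: it just measures, for each symbol, what fraction of its occurrences are word-final. The function name is an assumption, and the four sample words are borrowed from the Pisces labels quoted later in this post.

```python
from collections import Counter

def word_final_rate(words):
    """Map each symbol to the share of its occurrences that end a word."""
    total = Counter()
    final = Counter()
    for word in words:
        total.update(word)
        if word:
            final[word[-1]] += 1
    return {sym: final[sym] / total[sym] for sym in total}

rates = word_final_rate(["oty", "okaly", "otody", "otar"])
for sym in sorted(rates, key=rates.get, reverse=True):
    print(sym, round(rates[sym], 2))
```

A symbol with a rate near 1.0 is effectively glued to the space after it, which is the kind of regularity a small HMM will tend to latch onto first.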

Well… no. What this actually means is that a 2-state bigram HMM is woefully inadequate for analysing EVA-transcribed text. Essentially, EVA is a stroke transcription rather than a glyph transcription (hence many composite shapes are transcribed in two or three strokes): and so should never be used as the “raw” input to a statistical analysis programme. So they wasted their time using a 2-state bigram HMM: not even close. (Even if they didn’t use EVA, I would argue that a 2-state bigram HMM is thoroughly unsatisfactory for numerous other reasons, most of them connected with the behaviour of the EVA letters ‘a’, ‘e’, ‘i’, and ‘o’.)
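To make the stroke-versus-glyph point concrete, here is a hedged Python sketch of the kind of preprocessing step involved: regrouping EVA letters into candidate glyph tokens before any statistics are run. The list of composite groups is an illustrative assumption only (which groups really form single glyphs is exactly what is in dispute), and `to_glyphs` is a made-up name.

```python
# Hypothetical grouping table: EVA letter sequences treated as one glyph.
# This is an assumption for illustration, not an agreed glyph inventory.
COMPOSITES = ("ckh", "cth", "cph", "cfh", "ch", "sh")

def to_glyphs(eva_word, composites=COMPOSITES):
    """Greedy longest-match split of an EVA word into glyph tokens."""
    ordered = sorted(composites, key=len, reverse=True)
    glyphs = []
    i = 0
    while i < len(eva_word):
        for group in ordered:
            if eva_word.startswith(group, i):
                glyphs.append(group)
                i += len(group)
                break
        else:
            # No composite starts here, so keep the single EVA letter.
            glyphs.append(eva_word[i])
            i += 1
    return glyphs

print(to_glyphs("chedy"))
print(to_glyphs("okeoly"))
```

Changing the grouping table changes the effective alphabet, and hence every downstream frequency count, which is why the choice of transcription matters so much.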

In fact, arguably the fundamental statistical paradox about Voynichese as a script is that while it is riddled (quite literally, I suppose) with multiple overlapping internal structures, analysts have had very little luck building up Markov models to describe its behaviour; all of which is really quite the opposite of how you’d expect a well-formed language’s script to present. Even Jorge Stolfi’s long-standing “crust-mantle-core” model falls well short of being properly explanatory about the text. So, if Kevin Knight wants something Voynichian for his 2014 summer interns to get their teeth into, surely building up properly substantial Markov models for Currier A and Currier B (oh, and labelese too) would be an excellent starting point. Sort that out and we should all be sharing turkey and pepperoni pizza by Thanksgiving. 🙂

Jacques Guy applied Sukhotin’s algorithm to a glyph transcription, and so stood a better chance of getting sensible results than Reddy and Knight: yet I think the patterns in the text tell us a very much more complicated historical story than is captured by either of these two analytical tracks.

On the one hand, I think it is plain as day that we (the Voynich Manuscript’s ‘audience’, so to speak) are supposed to ‘read’ Voynichese in part as if it were a CVCV structured (non-abjad) thing. Look at the Pisces labels: these not only have a strong CVCV structure, but 25 out of the 30 also begin with the letter ‘o’ (presumably followed by a consonant, usually a ‘t’ or ‘k’ gallows character):-

otalal / otaral / otalar / otalam / dolaram / okaram / oteosal / salols / okaldal / ykolaiin / / oty / oky.ody / oty.or / okaly / otody / otald / otal.dar / okody / / chckhhy / otaly / otal.rar / otal.dy / okeoly / okydy / okees / otalalg / okasy / otar
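The prefix pattern in these labels is easy to check mechanically. The Python sketch below parses the label list exactly as transcribed above and counts how many labels begin with a given prefix. The helper names are assumptions, and the totals depend on which transcription is used and on whether dotted labels count as one label or two, so they need not match the figures quoted in the text exactly.

```python
RAW = ("otalal / otaral / otalar / otalam / dolaram / okaram / oteosal / "
       "salols / okaldal / ykolaiin / / oty / oky.ody / oty.or / okaly / "
       "otody / otald / otal.dar / okody / / chckhhy / otaly / otal.rar / "
       "otal.dy / okeoly / okydy / okees / otalalg / okasy / otar")

def parse_labels(raw):
    """Split on '/' and drop the empty entries left by the '/ /' separators."""
    return [part.strip() for part in raw.split("/") if part.strip()]

def prefix_share(labels, prefixes):
    """Return (labels starting with any of the prefixes, total labels)."""
    hits = sum(1 for label in labels if label.startswith(tuple(prefixes)))
    return hits, len(labels)

labels = parse_labels(RAW)
print(prefix_share(labels, ["o"]))
print(prefix_share(labels, ["ot", "ok"]))
```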

There is also the heavy repetition of ‘or’, ‘ar’, ‘ol’ and ‘al’ throughout the text to consider, especially in phrases such as “or oro ror”. Once you visually ‘tune in’ to this kind of pairing, I think it becomes hard not to see the text as largely CVCV structured.

On the other hand, I think it is very nearly as plain that there’s something terribly wrong with this CVCV model of Voynichese. The simplest objection is that if it is correct, then only ‘o’ and ‘a’ seem to participate in CVCV structured words, making Voynichese a vowelled language with only two genuinely combinable vowels. Which would be a nonsense, right?
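The two-vowel problem can be made visible by mapping words to consonant/vowel patterns under the assumption that only ‘o’ and ‘a’ are vowels. This Python sketch treats each EVA letter as one symbol (a simplification that the discussion of EVA above warns against) and uses made-up helper names.

```python
def cv_pattern(word, vowels=frozenset("oa")):
    """Map a word to a string of C/V marks, ignoring '.' separators."""
    return "".join("V" if ch in vowels else "C" for ch in word if ch != ".")

def alternates(pattern):
    """True if no two neighbouring marks in the pattern are the same."""
    return all(a != b for a, b in zip(pattern, pattern[1:]))

for word in ["otalal", "okaram", "okeoly", "chckhhy"]:
    pattern = cv_pattern(word)
    print(word, pattern, alternates(pattern))
```

Words built from ‘o’, ‘a’ and single consonants alternate cleanly, while anything containing ‘e’, ‘y’ or benched gallows breaks the pattern unless more letters are promoted to vowels.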

So if you think the Voynichese script is directly expressing an actual natural language, you’re stuck halfway between two extrema, because it’s neither consonanty enough to be an abjad (unvowelled) script, nor vowelly enough to be a proper abugida (vowelled) script. It’s a paradox, right?

Hence I personally think the only sensible conclusion is that Voynichese is a script that is neither an abjad nor an abugida, but is instead a covertext designed to resemble a plausible-looking language script (albeit one with too few vowels to register solidly as either category). The cryptographic truth falls between these either-or categorical boundaries erected by linguists, and in a much more subtle and devious way than linguists’ tools are able to handle comfortably. Good, isn’t it?

Indeed, “There are more things in heaven and earth, Horatio / Than are dreamt of in your philosophy.”

47 thoughts on “Is Voynichese an abjad?”

  1. bdid1dr on February 8, 2014 at 7:12 pm said:

    Nick, there are vowels throughout B-408, many of them ‘in plain sight’. ‘a’ is most obvious and most apparent. ‘e’ is only recognized by comparing the sizes of the alphabet character ‘c’ and the smaller ‘c’ — which both are often used in combinations such as cae, aesc, ceaus,….cease, each, easceus, teleceseus. Again, I refer you to the “handwriting” used for the medieval “Book Hand” — and, in particular Crimean Gothic. The language which is written in B-408 is Latin. I’ve just finished translating the 12th folio of some twenty folios I’ve downloaded and identified the specimen and/or discussions (astrology, balneological, botanical, and mythological). Actually, very little ‘history’ so far. Only Busbecq’s signing off on B-408, f-116v from Ankara, on his way home to Vienna and Bohemia/Czechoslovakia. He briefly sojourned at the court of Rudolph II. After he departed from Rudolph’s court, he was killed by French soldiers before he could reach home.

  2. SirHubert on February 8, 2014 at 8:02 pm said:

    An abjad is a script or writing system, not a language. Major difference.

    Turkish was written in an adapted version of Arabic script (impure abjad) until Ataturk replaced this with adapted Latin (vowelled). The language didn’t change, just the script.

  3. SirHubert: you got me bang to rights 🙂 I know the difference perfectly well but didn’t express it as clearly as I should have done (I had a lot of ground to cover and was typing quite fast). I’ve overhauled the post to try to make it much more precise qua language vs script, please let me know if you spot any things that have slipped through the net, thanks!

  4. Nick, can I come on here where the serious stuff is aired? I won’t even mention the d word.
    With all these strangely written languages you fellows decipher, I have to ask; is the message worth the labour?

  5. bdid1dr on February 9, 2014 at 5:06 pm said:

    Guys, ae, ai, aeo, au, aeu. balnea, balance, caesar –. That much of the abecedary is apparent in most of the folios. More obscure vowels are ‘eo’ as in ‘you’, ‘yearling’….. I’m still working on the silent ‘h’ which appears often in the Cockney dialect or in the Book of ‘Hours’. Another example I’ve used frequently is a tout a l’heure! (casual French for ‘see you later’). So, with every translation I’ve done, I have been looking for those ‘missing’ vowels which often only appear between the lengthy ligatured ‘c’——-‘e’ or ‘e’—–‘c’. A word I’ve been looking for is ‘appreciate’. eu-st so I cn tll Nck ‘ou’ mch I app-re-ci-ate his n-d-vrs!

  6. Petebowes on February 10, 2014 at 9:28 am said:

    Sirhubert: Abjad is a language if one man can read it to another and be understood.

  7. Petebowes on February 10, 2014 at 9:32 am said:

    surely ..

  8. Pete: abjad is a way of writing any language. Y cn wrt nglsh n n bjd wy f y wnt t. 🙂

  9. “An abjad is a script or writing system, not a language. Major difference.” that’s bert
    “abjad is a way of writing any language.” that’s nick

    “Is it worth the effort?” that’s me .. this is called trolling with a lure ..

  10. SirHubert on February 11, 2014 at 8:14 am said:

    Stephen: can I ask whether ‘abjad’ is a universally recognized and linguistic term for an entirely vowel-less script, as opposed to something like Arabic where only long vowels are noted?

    It’s obviously an acronym of alif-ba-jim-dal, of which the first letter is…er…a.vowel…

    More specifically, I think we need to have an agreed distinction between ‘Voynichese’ language and ‘Voynichese’ script, because at the moment the word is used for both, often confusingly or inappropriately.

  11. SirHubert on February 11, 2014 at 8:14 am said:

    *standard linguistic term

    baby helping me type this morning!

  12. SirHubert on February 11, 2014 at 10:21 am said:

    Stephen: also, and forgive the slightly off-topic post, I think that the final alif at the end of shukran is to indicate that it’s an adverbial form derived from an archaic accusative case? So I’m not sure whether the alif is actually representing a vowel as opposed to acting as a case marker? But my apologies if I remember wrong – my Arabic is not as good as it should be.

    There are exceptions (and don’t get me started on the alif maqsura) but most instances of vowels in Arabic script will be one of the three long vowels usually transcribed ‘a’, ‘i/y’ and ‘u/w’.

    I’m curious about the association of the final ‘y’ with a following space, which I can’t immediately find explicitly stated in Reddy and Knight’s paper. This sounds like the behaviour of the Arabic ta marbuta, which is a special letter only found at the end of a word – and so can only be followed by a space…

  13. Gentlemen: I was taught, by a young gentleman Turk, two most common sentences used when in polite society: Because of my deafness, he taught me phonetically t’shuk’ran: “Thank Allah and the Koran” Another polite phrase was ” lotvan betteme char” “I would like a cup of tea”….Because of my reliance on lip reading to comprehend what was being said, Haig tried to spell those phrases for me. Between translating Turkish into English, and trying write both Turk and English scripts, we all ended up in hysterics, Haig, his girlfriend Muriel (Armenian), me, and two gentlemen from the adjoining apartment. One of the two gentlemen gave up on linguistics and began a riff on Einstein’s theory of relativity…….

  15. I’ve been reading through a few of these comment threads and I have to ask, what’s the deal with bdid1dr? He seems to know what he’s talking about, but people don’t converse with him. Is it worth trying to follow his leads? Why don’t people entertain him? (With all due respect to you, bdid1dr, I only want to understand)

  16. I dunno Ralph, old biddly sounded like he had a lot of fun sitting around a table in Istanbul – I heard that’s what the place did for you, my best was Tangiers in ’66, following Ginsberg and Kerouac.
    Never found them, got waylaid in a big smoky room around a table with some friends instead.

  17. bdid1dr on February 13, 2014 at 3:22 pm said:

    Ol’ biddly is a biddy (female chicken) who has just gotten beady-eyed with old age (70 y.o this past September). Much of what I contribute to Nick’s various pages, I do in the hopes that he will be producing a sequel to his first book.
    My latest investigations (yesterday) into Aztec/Mixtec writing was interrupted by a trip into town. I forgot to bookmark it, but if you refer to “Classical Nahuatl language” you may find a brief discussion, a neat pronunciation chart, and some great, full-color glyphs, as well as references to its current-day use.
    Nick, please forgive my double posts. My elderly computer got a “tune-up” yesterday, so I think she may be able to keep up with my typing now.
    beady-eyed wonder 🙂

  18. bdid1dr on February 13, 2014 at 3:52 pm said:

    ps: If you do get to the page I’ve referred to, please note that the writer briefly refers to the ‘devastating loss’ of thousands of Aztec manuscripts. In more recent times we’ve called those manuscripts “Codices”. Gorgeous, brilliantly illustrated in full color! The author of the page I’ve been referring to also has done a terrific job of ‘illustrating’ his comments. Also note the ‘t’, ‘tl’, and ‘tz’ (no ‘tr’ because, I’m pretty sure, ‘tr’ was either ‘trilled’ or ‘tapped’, depending on context).

  19. bdid1dr on February 15, 2014 at 4:36 pm said:

    quetzal-coatl : There’s a South American word for you — which could be written and understood in Abjad: q-tzl-ctl . Nick, were you able to follow up my mention of a Jim Reed who works/worked for the San Jose Historical Museum (California)? I’m pretty sure he had access to the Spanish manuscript archive — or at least the microfilm reels.
    The context of that earlier post was my sympathetic reply to your plaintive likening of your research and blog discussions to the efforts of climbing a mountain and the lack of adequate oxygen.

  20. bdid1dr on February 17, 2014 at 5:14 pm said:

    I’m now comparing ‘N-as-ta-liq’ and ‘Na-hua-tl’ scripts with the script in B-408…….

    beady-eyed wonder: bdid1dr

  21. bdid1dr on February 27, 2014 at 4:15 pm said:

    Thanx, Ralph, for considering my various posts, as well as noting the lack of any response or follow-up from our host or fellow codiologists. Perhaps it is because I am not de-codifying; but rather deciphering word by word (and citing proofs and sources)?
    I do try to make my posts non-argumentative, and with a ‘touch’ of humor ‘here & there’ — if not downright punningful.
    beady-eyed (bdid) wonder (1dr) with a wink: 😉

  22. bdid1dr on March 3, 2014 at 5:04 pm said:

    Yesterday, my husband brought me a copy of “An Aztec Herbal — The Classic Codex of 1552” (William Gates compiler/commentary). His introduction to the “De la Cruz-Badiano” herbal cleared up some of the current-day confused contributions to various of Nick’s ‘voynichese’ discussion pages.
    I am sticking to my identification/translation ‘xiuh – amolli’, as far as adding to my earlier comments on Nick’s “Nahuatl” pages. Stephen Bax may like to get a copy of the Dover publication (even though the cross-referencing does not include the root of a yucca plant which is also identified as Xiuh-amolli, soap plant-Saponaria americana-on page 122). A full-color illustration of the yucca-root saponaria appears on the inside of Dover’s publication cover (bottom-most left corner).
    So, Nick, I’m hoping you may follow-up on this latest post, maybe by X-referring to it on your Nahuatl discussion pages (which have now been ‘filed’?).

  23. bdid1dr on March 5, 2014 at 5:59 pm said:

    Yesterday, my husband handed me two more reference books: “The Codex Borgia” (Diaz-Rogers), and a fantastic pocket-book size translation dictionary by Fermin Herrera.
    Herrera clears up some of the slight discrepancies which appear in various translations of botanical manuscripts. So, what I am seeing in B-408 are ‘field notes’ which eventually ended up in various RC cardinals’ family manuscript archives. Some of which ended up in Papal archives.
    Pope John II appears to have been one of the most benevolent and diplomatic religious leaders of our times.

  24. Dennis on March 6, 2014 at 1:47 am said:

    I remember that Jacques Guy published a Cryptologia article on applying the Sukhotin algorithm to texts not in the Latin alphabet. AIRC it identified ‘alef, yod, ‘ayin and vav in Hebrew as vowels, as you might suspect.

    Mark Stamp ran a HMM on Hamptonese, and said that it didn’t show consonants and vowels, let alone higher-order groups. However, I showed that if you relaxed his criteria, his HMM showed the same consonants and vowels as the Sukhotin algorithm.

  25. Diane on April 2, 2015 at 7:06 pm said:

    A footnote. Young, Champollion, Ventris et al only had to decipher a script. People working on the Voynich manuscript have to decipher, and provenance, and locate (in terms of time and place) an object as near to unique as dammit.

  26. Diane: all of which is why I like to say that cracking the Voynich Manuscript is like trying to break the triple jump world record with a single leap. 🙂

  27. Diane on April 2, 2015 at 8:05 pm said:

    Hey, let’s just channel it; everyone else does.

  28. Diane on April 3, 2015 at 9:37 am said:

    “triple jump record”…

    Interesting comparison. From my side, I suppose I’d see the process more like Fossey’s work.. you know, ‘Gorillas in the Mist’. Don’t know if I’d commit myself to thirteen years, but the parallel is close enough.

  29. SirHubert on April 3, 2015 at 12:11 pm said:

    CONGRATULATIONS! We have a winner of the Stephen Bax Award for Not Understanding Decipherments of Ancient Scripts and Languages!

  30. SirHubert: I’d be surprised if he would award a prize for anything apart from confirming his own eternal correctness. But… Life can be strange, eh?

  31. Diane: twenty years is about the norm for Voynich research. Newbold only got off with less because it killed him, it seems.

  32. Diane on April 3, 2015 at 7:06 pm said:

    Hubi seems a little miffed.

    (I meant that we also have to do the equivalent of Evans’ work, too, not just de Ventris’).

    Newbold was in his sixties when he died. His explanation was given some years before, I think, in a paragraph within the presentation to the College of Physicians. Too sad.

  33. Diane on April 3, 2015 at 7:38 pm said:

    I think it’s quite important not to forget that the manuscript is not a medium for conveying a cipher – sometimes the cipher people do seem to forget that. It’s just one element in one element (the written part of the text) within a manuscript whose aim was not to embody “a cipher”. The cipher – assuming there is one – was meant to join with the imagery in conveying a meaning which, if not identical, is at least reasonably supposed complementary. It’s all context, and meaning within a given society, the way any other form of expression is – so tomoimasu.

  34. Diane on April 3, 2015 at 7:44 pm said:

    Let me re-phrase that:
    The cipher – assuming there is one – is just one element within one element (the written part of the text) and is not a definition of the manuscript or its primary purpose. Sometimes the cipher people seem to lose sight of that importance of context. The manuscript does not present to me as no more than a medium for conveying a cipher, but was intended first of all to convey meaning ~ to discern which we need to establish likely context, and account for the complementary (if not identical) meaning embodied in materials and imagery. If you want to posit a hyper-intelligent cipher, and hyper-intelligent composer of the cipher, it is necessary to explain why they did not have access to better materials – just for a start.

  35. SirHubert on April 4, 2015 at 4:01 pm said:

    Diane: all I meant was that you don’t seem to understand very much about the decipherment of ancient scripts and languages. And I’m afraid your further comments confirm that in my eyes.

    If you are thinking of raising this topic in your projected book, I would suggest you consider asking some of your academic colleagues. Or, as a quick first step, the Wikipedia entry for the decipherment of Linear B is good enough to illustrate the problems with what you have said.

  36. Diane on April 5, 2015 at 12:47 am said:

    I trained as an industrial archaeologist, and we did a bit of background work in classical, and its history. If you look again, you may notice that Nick and I were swapping metaphors, not discussing whether or not the written part of this manuscript’s text is enciphered.

    Some latitude may be permitted the poetic and metaphorical, don’t you think?

  37. Diane on April 5, 2015 at 1:03 am said:

    I might add, for general information, that the book will speak about history and iconography. About the more formal aspects of the written text, I will comment, but about the various (still inconclusive-to-speculative) efforts to extract meaning from it, I do not see much point in repeating all that others have said and written, except in a brief summary as appendix.

  38. SirHubert on April 5, 2015 at 10:53 am said:

    Diane: no, you made a statement which is wrong in every respect. But I fear that Satan will skate to work before you acknowledge that, let alone bother to find out why.

    *shrugs and gives up*

  39. Diane on April 5, 2015 at 5:03 pm said:

    Bert, I have no idea which of the comments which passed between me and Nick you have taken such objection to. As far as I read them, we were describing how it *feels* to work on the Voynich manuscript, and I see you make no objection to its being likened to a triple jump – yet you do object to another metaphor, based on long-term careful observation.

    Perhaps the problem is not so much my metaphor, as that you have difficulty thinking metaphorically…
    *shrugs* –

  40. Goose on April 7, 2015 at 2:44 pm said:

    Dunning-Kruger effect:
    “for a given skill, incompetent people will:
    -fail to recognize their own lack of skill
    -fail to recognize genuine skill in others
    -fail to recognize the extremity of their inadequacy”

    Sir Hubert, it’s very kind of you to have tried to save this one from embarrassment, but sadly, she’s incapable of accepting help.
    I cringe in advance at the insufferable display of conceit, presumptuousness and hubris that will accompany this projected book.
    The higher the monkey climbs, they say…

  41. Diane on April 8, 2015 at 12:14 am said:

    I would like you to read the comment above, and consider whether or not one might consider it a public libel.

  42. Diane on April 8, 2015 at 12:22 am said:

    Goose – this is not the Voynich mailing list. I think your difficulty is that I do not subscribe to the ‘central European’ theory, but believe the work we have is a copy of earlier Jewish texts. If you’d care to discuss that matter, it would be relevant.

  43. Diane on April 8, 2015 at 12:30 am said:


    The quality of a man’s friends marks the quality of the man.

    Rather you had Goose than I.

  44. Diane on April 8, 2015 at 3:36 am said:

    Actually, Goose, the description is perfect for the way in which the Vms images have often been treated:

    “for a given skill, incompetent people will:
    -fail to recognize their own lack of skill
    -fail to recognize genuine skill in others
    -fail to recognize the extremity of their inadequacy”

    Thanks, I may well use that.

  45. Diane: it was a somewhat cutting and unpleasant comment, sure, but ‘libelous’ tends to be determined less by content than by the quality of brief the recipient can afford.

  46. Diane on April 8, 2015 at 10:43 am said:

    So true – and it would be a farthing sought anyway. But I’ll take the gem of a quote in lieu.

  47. Goose on April 8, 2015 at 12:34 pm said:

    You have no qualms being slanderous towards other voynicheros and dissing their contributions on your website as well as on here all the time.
    Don’t dish it out if you can’t take it.

