Though I haven’t posted much about the Voynich Manuscript here recently, I have actually been doing a lot of research into it (for Curse 2), as well as a lot of thinking about how to decrypt it, mainly by trying to devise cryptological mechanisms that stand even a passing chance of achieving that (an “attack vector”, if you like).

Yesterday, I came up with something new (well, new to me, anyway). Having recently read a pile of books on WW2 codebreakers (e.g. the excellent “The Man Who Broke Purple”) and WW2 codebreaking (thanks to the whole cipher pigeon thing), an idea fresh in my mind was that one way to break a cipher system would be to get multiple instances of the same plaintext enciphered in different ways, and use that to understand how the cryptographic framework works. So… might there be any plaintext in the Voynich that was enciphered multiple times?

Well: it’s well known that a fair number of the herbal pictures reappear as small versions in the two pharma sections. These might well be visual recipes (as normally believed); or a visual cross-referencing hack (in the Quattrocento style of Mariano Taccola); or nonsense; or something else entirely. At the very least, however, this does tell a visual story about the content: that the drawings are far from completely arbitrary [as would be convenient for some people’s Voynich theories], but instead are consistent and rule-based, even if we can’t yet discern what those rules are.

But what struck me as possibly offering us a chink into the Voynich’s cryptographic armour is the presence of two herbal pages as well as a recipe page all containing what seems to be the same plant… f17v, f96v & f99r:

voynich f17v

voynich f96v

voynich f99r bottom recipe

Might it be that these three pages not only contain the same plant, but also the same (or very similar) plaintext enciphered in different ways? As readers of The Curse of the Voynich will doubtless know, I have a whole constellation of long-standing hunches about how Voynichese works: but finding effective ways of testing all these ideas has proved immensely tricky.

Anyway, let’s have a look at the texts (in EVA) [I’ve used Stolfi’s transcription as a broad starting point, and ‘bold’ed the Neal keys on the two herbal pages’ top lines]:

f17v.P.1;F pchodol chor fchy opydaiin odaldy –
f17v.P.2;F ycheey keeor ctho dal okol odaiin okal –
f17v.P.3;F oldaik odaiin okal oldaiin chockhol olol –
f17v.P.4;F kchor fchol cphol olcheol okeeey –
f17v.P.5;F ychol chol dolcheey tchol dar ckhy –
f17v.P.6;F oekor or okaiin or otaiin d –
f17v.P.7;F sor chkeey poiis chor os saiin –
f17v.P.8;F qokeey kcha rol dy chol daiin sy –
f17v.P.9;F ycheol shol kchol chol taiin ol –
f17v.P.10;F oytor okeor okar okol doiir am –
f17v.P.11;F qokcheo qokoiir ctheol chol –
f17v.P.12;F oy choy keaiin chckhey ol chor –
f17v.P.13;F ykeor chol chol cthol chkor sheol –
f17v.P.14;F olor okeeol chodaiin okeol tchory –
f17v.P.15;F ychor cthy cheeky cheo otor oteol –
f17v.P.16;F okcheol chol okeol cthol otcheolo –
f17v.P.17;F m qoain sar she dol qopchaiin cthor –
f17v.P.18;F otor cheeor ol chol dor chr oreees –
f17v.P.19;F dain chey qoaiin cthor cholchom –
f17v.P.20;F ykeey okeey cheor chol sho ydaiin –
f17v.P.21;F oal cheor sholor or shecthy cpheor daiin –
f17v.P.22;F qokeee dar chey keeor cheeol ctheey cthy –
f17v.P.23;F chkeey okeor char okeom =

f96v.P.1;F psheas sheeor qoepsheody odar ocpheo opar ysar aso* –
f96v.P.2;F ytear yteor olcheey dteodaiin saro qoches ycheom –
f96v.P.3;F dcheoteos cpheos sar chcthosy cth ytch*y daiin –
f96v.P.4;F dsheos sheey teo cthy ctheodody –
f96v.P.5;F tockhy cthey ckheeody ar chey key –
f96v.P.6;F yteeody teodar alchey sy –
f96v.P.7;F sheodal chor ary cthol –
f96v.P.8;F ycheey ckheal daiins –
f96v.P.9;F oeol ckheor cheor aiin –
f96v.P.10;F ctheor oral char ckhey –
f96v.P.11;F sar os checkhey socth –
f96v.P.12;F sosar cheekeo daiin –
f96v.P.13;F soy sar cheor =

f99r.P4.13;F tol.keey.ctheey-{plant}
f99r.P4.14;F ykeol.okeol.o!ckheo.chol.cheodal.okeo!r.alcheem.orar-{plant}
f99r.P4.15;F okeeey.keey.keeor.okeey.daiin.okeol!s.aiin.olaiir.o!olshl-
f99r.P4.16;F qokeeo.okeey.qokeey.okisy.qokeeo.sar.sheseky.or.al-{plant}
f99r.P4.17;F **aiin.c!!!!khey.acthey.dy.daiin.okor.okeey.shcth!!!*!sh-
f99r.P4.18;H ychor.ols.or.am.air.om

The first thing I’d note is that, even though both herbal pages are marked up as “Herbal A” pages, their ciphertexts appear to have a completely different internal structure from each other. Specifically, f17v has lots of repetitive sequences such as “ychol chol dolcheey tchol” / “chol chol cthol” / “okeor okar okol”, etc; while f96v has a different (dare I say more sophisticated?) feel altogether, with a nicely fluid use of letters. By way of further contrast, f99r is full of “ee” shapes such as “keey ctheey” / “okeeey.keey.keeor.okeey” / “qokeeo.okeey.qokeey”, which looks clunky and repetitive in a quite different way from f17v.

The fact that all three text sequences accompany broadly the same diagram is surely some kind of indication that their contents could well be related in some way. However, there is (as far as I can see) no obvious textual overlap between the three of them. Hence I really don’t think the significant differences here can be accounted for purely in terms of presumed content. As a consequence, even though all three texts share the same glyphic building blocks, I think the precise ways the cipher system was employed in all three differ quite widely.
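As a rough way of putting a number on "no obvious textual overlap", here is a minimal Python sketch that compares the word vocabularies of the three pages using Jaccard similarity. The function names are my own, the handling of `!`, `{plant}` and line-end markers is an assumption about the transcription conventions, and only two lines per page are pasted in for brevity. Shared vocabulary is a crude proxy: very common words such as "daiin" will match without implying any shared plaintext.

```python
import re

def eva_words(lines):
    """Collect the set of word tokens from EVA transcription lines."""
    vocab = set()
    for line in lines:
        line = re.sub(r"\{[^}]*\}", "", line)  # drop inline comments such as {plant}
        line = line.replace("!", "")            # assumption: '!' is filler, not a glyph
        for word in re.split(r"[.\s]+", line):  # both '.' and whitespace separate words
            word = word.strip("-=–")            # strip line-end markers
            if word:
                vocab.add(word)
    return vocab

def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Two sample lines per page, copied from the transcription above.
f17v = eva_words(["ycheey keeor ctho dal okol odaiin okal –",
                  "qokeey kcha rol dy chol daiin sy –"])
f96v = eva_words(["ycheey ckheal daiins –",
                  "sosar cheekeo daiin –"])
f99r = eva_words(["tol.keey.ctheey-{plant}",
                  "okeeey.keey.keeor.okeey.daiin.okeol!s.aiin.olaiir.o!olshl-"])

for name, (a, b) in {"f17v/f96v": (f17v, f96v),
                     "f17v/f99r": (f17v, f99r),
                     "f96v/f99r": (f96v, f99r)}.items():
    print(name, sorted(a & b), round(jaccard(a, b), 3))
```

To do this properly you would feed in every line of each page, and probably compare word sequences rather than bare vocabularies.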

Unfortunately, this probably points to a weakness in the way we tend to talk about Voynichese: that we haven’t really established anything like a proper cryptographic ‘roadmap’ of the system’s evolution to help us navigate these differences with confidence. The page classifications we have inherited from Prescott Currier remain helpful in a fairly high-level sense, but I think our cryptanalytical needs have outstripped their low-level utility – they aren’t really strong enough tools to help us deal with the ciphertext itself.

And so my real Voynich research lead of the day is simply this: that I think we don’t yet know enough about the cryptanalytical differences between individual pages of Voynichese to be able to group / categorise / classify them effectively. What were the stages of evolution of the cipher system? What shapes or groups evolved into (or were replaced by) what? And why has it taken us more than a century to ask such basic questions?

Maybe, though, this is simply a consequence of the lack of detailed codicological insight we have into the original bifolio nesting and gathering layout (as well as composition order). If we had all that properly locked down, then perhaps we’d start to be more inquisitive about the changes going on in the cipher system, rather than just saying “it’s an A-page” or “it’s a B-page”.

Right now, looking at these three short sections, I have to say that it feels to me as if we still know next to nothing about how this cipher actually works.

Ulrik Heltoft’s “The Voynich Botanical Studies and The Origin of Specimen 52v” artworks will be at Andersen’s Contemporary art gallery in Copenhagen over the next few weeks (20.04.2013 to 11.05.2013), and I have to say that they’re really rather… eerie. But in a nice way!

Essentially, what Heltoft and his collaborator Miljohn Ruperto have done is recreate (after a fashion) a number of the Voynich Manuscript’s curious plant drawings. Their manipulated images were then fixed as large silver gelatin prints, lifting the Voynich’s unpindownable unworldliness (and indeed impracticability) to curious new heights. Having said that, I’m not sure what “The Origin of Specimen 52v” specifically refers to (apart from f52v itself, of course). Perhaps it will become obvious as photos of the installation start to appear on the Internet.

Anyway, here’s their 52r plant side by side with the Voynich’s f52r plant:

f52r-comparison-small

If you want to see some more, here’s a link to four pretty high-resolution Voynich Botanical Studies images.

But why did they do it? Well, according to this site:

Ulrik Heltoft (b. 1973) graduated from The Royal Danish Academy of Fine Arts in 1999 and from Yale University in 2001. He is an associate professor of photography at The Royal Danish Academy of Fine Arts and has had solo exhibitions at Kirkhoff Contemporary Art, Raucci e Santamaria in Naples and Wilfried Lentz in Rotterdam. His works have also been shown at places such as Participants Inc., New Museum, Anthology Film Archive in New York, and the Hammer Museum in Los Angeles. Heltoft’s artistic activity is characterized by formally rigorous, technically perfect works in which minimal displacements suggest that “something else” is at play.

So basically, Heltoft is a Yale-graduated photography professor specializing in a rigorous-looking, false-historic aesthetic. Really, could there ever have been a flicker of a doubt in anyone’s mind that one day he’d ‘do’ the Voynich? Hmmm… maybe next he’ll do pages from its balneo section, but where every ‘nymph’ is the same model. Or perhaps instead he’ll move on to the Vinland Map? It’s always nice to have a Plan B, right? 😉

The Voynich Manuscript is a dismal reality TV channel, where every participant’s ten minutes of fame segues quickly into an eternity of opprobrium: few people dipping their feet into its toxic slurry get to keep all their toes for very long. I’m sorry to have to say such a thing, but as the modern philosophers Run and DMC put it, “it’s like that, and that’s the way it is“.

All of which is a long-winded way of saying that, somewhat surprisingly (at least to me), Gordon Rugg has this week returned to the Voynich’s rancid riverside with a fresh supply of podalic digits for dunking. But this time around he’s appropriating its mysteries not to promote the claimed benefits of his “Verifier Method” (a meme which seems not to have taken root), but to promote his newly-patented toy for 2013, that he somewhat grandly calls the Search Visualizer (rather as if he’s inventing a whole new field).

Yet unless I’ve misunderstood it significantly, all the Search Visualizer actually does is:
* draw a rectangle representing an input document
* draw dots on it wherever one of a user-defined set of syllables or words appears, with each dot a different colour.
Thus a SV user can, for example, map out that ‘witch’ and ‘sleep’ appear in different clusters within Macbeth. So far, so facile.
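To make that description concrete, here is a minimal text-only Python sketch of the same basic idea: lay the document out as a grid of word positions and mark each position where one of the search terms appears. This is my own toy reconstruction from the two bullet points above, not Rugg's software; the function name, the one-letter markers standing in for colours, and the whole-word matching are all assumptions.

```python
def dot_map(text, terms, width=40):
    """Return grid rows: '.' for an ordinary word, a marker letter for each search term."""
    # Assign 'A' to the first term, 'B' to the second, and so on (stand-ins for colours).
    markers = {term: chr(ord("A") + i) for i, term in enumerate(terms)}
    tokens = text.lower().split()
    # Strip trailing/leading punctuation so that "witch," still matches "witch".
    cells = [markers.get(tok.strip(".,;:!?"), ".") for tok in tokens]
    # Wrap the flat sequence of cells into rows of the requested width.
    return ["".join(cells[i:i + width]) for i in range(0, len(cells), width)]

for row in dot_map("the witch did sleep and the witch woke", ["witch", "sleep"], width=4):
    print(row)
# prints:
# .A.B
# ..A.
```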

Nonetheless, I (perhaps) hear you ask eagerly, what can the Search Visualizer teach us about – dan dan darrrr – the Voynich Manuscript? Unsurprisingly (given the amount of exposure his Voynich claims gave the Verifier Method all those years ago), that’s the subject of this week’s blog post from him.

Having used SV to draw a lot of diagrams of (in the EVA transcription) “daiin”, “qo”, “dy”, and “ol”, Rugg concludes from the “banding” (basically, section structure) visible in those diagrams that…

It’s completely inconsistent with the theory that Voynichese is a single unidentified language, or with the theory that Voynichese consists of two dialects of a single unidentified language.

If we’re looking at dialects, then there are at least six of them, and some appear to be more different from each other than English is from German, at least on the preliminary results from my work so far (I looked at other German texts, and saw the same distribution patterns as in the book example above).

If we’re looking at a coded text, then there appear to be at least half a dozen different versions of the code, or at least half a dozen different codes producing similar but not identical types of text.

Of course, the main person who failed to grasp that the whole Currier-A-&-B-languages thing wasn’t anything like a binary either-or (despite Rene & I telling him several times, as I recall) was, errrm, Gordon Rugg himself. So this is, unusually, a straw man argument where the straw man is the researcher himself (but 9 years in the past).

Anyway, even though his “Verifier Method” (in my opinion) falls well short of David Hackett Fischer’s splendid book “Historians’ Fallacies”, let’s apply it to the Search Visualizer:-

1. Accumulate knowledge of a discipline through interviews and reading.

I’ve read the article and most of his website, too. I’m an IT professional and a computer scientist. I can see what he’s doing: rectangles and coloured dots.

2. Determine whether critical expertise has yet to be applied in the field.

As far as the Voynich Manuscript goes, I don’t see any reference to:-
* codicology (though he’s added an addendum noting that the order of the pages may be wrong in “some cases”, this clearly isn’t reflected in his conclusions, which are almost entirely about the whole “banding” and “sub-banding” thing)
* Prescott Currier’s famous analysis (A pages, B pages, but plenty of intermediate ‘dialects’ too), which isn’t mentioned once. That’s right, not once. Anywhere.
* statistical analyses carried out by researchers other than Gordon Rugg or his students.

Sorry, but that seems like a very uncritical, self-contained way of working.

I would add that I don’t see a lot of critical expertise being applied to historical cryptography: what appears instead is a partial rendering of the history to support previously held positions.

3. Look for bias and mistakenly held assumptions in the research.

There’s plenty of bias towards his grille method, as well as naysaying against mainstream historical cryptography (which, let’s remember, he is trying to rewrite to support his particular story).

There’s also bias towards his 16th century dating in the face of fairly rock-solid scientific, art history, codicological, and palaeographic dating to the start/middle of the 15th century, which doesn’t really appear in his presentation.

4. Analyze jargon to uncover differing definitions of key terms.

What Rugg calls “Search Visualization” (drawing a rectangle of coloured dots) surely seems rather a low-grade kind of search. The point about “search” (in the Google sense) is surely that it finds things you didn’t previously know about and includes filters that promote relevance: whereas feeding pre-determined syllables and drawing coloured dots in a rectangle is only barely pattern-matching, and only barely visualization.

5. Check for classic mistakes using human-error tools.

* Using an outdated transcription
* Not filtering out all the embedded comments (is there a better explanation for this than sheer laziness?)
* Relying on computer science alone without integrating genuine historical research
* Arguing from possibility rather than from probability or fact
* Not responding to criticism from actual domain experts

6. Follow the errors as they ripple through underlying assumptions.

(Too boring to do if so many mistakes have been identified in steps 1-5.)

7. Suggest new avenues for research that emerge from steps one through six.

Surely a proper academic would be building a tool that would find telling letter clusters for you from an input text, using Hidden Markov Models and all kinds of proper statistical mechanisms? Shouldn’t something like the Search Visualizer be about finding things you don’t already know about, and only then helping you visualize them?

All in all, I find it extraordinarily hard not to get cross about this, because Rugg seems to be exactly reprising what he did all those years ago, once again at the cost of the whole research area. And once again, his driving force appears to be “ask not what you can do for the Voynich Manuscript, ask what it can do for you.” Sad, very sad.

Over the last few years I’ve read (and indeed reviewed) plenty of Voynich-themed novels, and indeed have several queued up here I’m trying to steal enough time to read (e.g. Linda Lafferty’s The Bloodletter’s Daughter, etc).

So my default answer to the question “does the world need another Voynich-themed novel?” is normally “no, sorry, I don’t honestly think it does“. Even so, I have to say I’m looking forward to the English version of one just released in Spanish by Enrique Joven (disclaimer – whom I collaborated with on a Spanish history-of-the-telescope article back in 2008).

El Templo del Cielo

His previous book (“The Book of God and Physics”, I never did like that title) was a Voynich novel set in modern times, but his new book “El templo del cielo” (i.e. “The Temple of the Sky”, though doubtless his publishers will rename it “The Book of Noodles and Zodiacs”, *sigh*) is set in the early 17th century. Hence it’s kind of a “Voynich prequel”. Errrm… except if he writes a further Voynich novel set in the fifteenth century, when I guess it would become a “Voynich postprequel”. Or (more likely) “book two of the trilogy”. 🙂

In real (i.e. non-novel-writing) life, Enrique is a professional astronomer in Tenerife, and so likes to build his books around ideas that define the history of astronomy. So what’s nice here is that because he has his (historically real) team of Jesuit missionaries (supposedly) take the Voynich Manuscript with them to China (along with the 7000 volumes they did genuinely take), his story should foreground many interesting aspects of the ups and downs of that whole historical sequence. In fact, when I discussed this little-known history here back in 2010, Enrique left a comment outlining what his novel would be about. So we can’t say he didn’t warn us! 😉

PS: here’s a link to Enrique’s blog.

USC’s irrepressible Kevin Knight and Dartmouth College Neukom Fellow Sravana Reddy will be giving a talk at Stanford on 13th March 2013 entitled “What We Know About the Voynich Manuscript“. Errm… which does sound uncannily like the (2010/2011) paper by the same two people called, errrm, let me see now, ah yes, “What We Know About the Voynich Manuscript“.

Obviously, it’s a title they like. 🙂

As I said to Klaus Schmeh at the Voynich pub meet (more on that another time), what really annoys me when statisticians apply their box of analytical tricks to the Voynich is that they almost always assume that whatever transcription they have to hand will be good enough. However, I strongly believe that the biggest problem we face precedes cryptanalysis – in short, we can’t yet parse what we’re seeing well enough to run genuinely useful statistical tests. That is, not only am I doubtful of the transcriptions themselves, I’m also very doubtful about how people sequentially step through them, assuming that the order they see in the transcription of the ciphertext is precisely the same order used in the plaintext.

So, it’s not even as if I’m particularly critical of the fact that Knight and Reddy are relying on an unbelievably outdated and clunky transcription (which they certainly were in 2010/2011), because my point would still stand regardless of whichever transcription they were using.

In fact, I’d say that the single biggest wall of naivety I run into when trying to discuss Voynichese with people who really should know better, is that hardly anyone grasps that the presence of steganography in the cipher system mix would throw a spanner (if not a whole box of spanners) in pretty much any neatly-constructed analytical machinery. Mis-parsing the text, whether in the transcription (of the shapes) and/or in the serialization (of the order of the instances), is a mistake you may well not be able to subsequently undo, however smart you are. You’re kind of folding outer noise into the inner signal, irrevocably mixing the covertext into the ciphertext.

Doubtless plenty of clever people are reading this and thinking that they’re far too smart to fall into such a simple trap, and that the devious stats genies they’ve relied on their whole professional lives will be able to fix up any such problem. Well, perhaps if I listed a whole load of places where I’m pretty sure I can see this happening, you’ll see the extent of the challenge you face when trying to parse Voynichese. Here goes…

(1) Space transposition cipher

Knight and Reddy are far from the first people to try to analyze Voynichese word lengths. However, this assumes that all spaces are genuine – that we’re looking at what modern cryptogram solvers call an “aristocrat” cipher (i.e. with genuine word divisions) rather than a “patristocrat” (with no useful word divisions). But what if some spaces are genuine and some are not? I’ve presented a fair amount of evidence in the past that at least some Voynichese spaces are fake, and so I doubt the universal validity and usefulness of just about every aggregate word-size statistical test performed to date.

Moreover, even if most of them are genuine, how wide does a ciphertext space have to be to constitute a plaintext space? And how should you parse multiple-i blocks or multiple-e blocks, vis-a-vis word lengths? It’s a really contentious area; and so ‘just assuming’ that the transcription you have to hand will be good enough for your purposes is actually far too hopeful. Really, you need to be rather more skeptical about what you’re dealing with if you are to end up with valid results.
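To show how much the parsing choice matters, here is a small Python sketch that computes a word-length histogram twice: once treating uncertain "half-spaces" as real word breaks, and once treating them as no break at all. It assumes a transcription convention in which `.` marks a definite space and `,` an uncertain one; the sample line is made up for illustration and is not taken from any page.

```python
from collections import Counter

def word_lengths(line, half_space_is_break):
    """Histogram of word lengths, with ',' (uncertain space) parsed one way or the other."""
    # Assumption: '.' is a definite word space, ',' is an uncertain half-space.
    half = " " if half_space_is_break else ""
    text = line.replace(",", half).replace(".", " ")
    return Counter(len(word) for word in text.split())

sample = "qo,kedy.dal.ol,chedy"   # hypothetical line, for illustration only

print(word_lengths(sample, half_space_is_break=True))
print(word_lengths(sample, half_space_is_break=False))
```

Even on one short line the two parsings give visibly different distributions, which is the worry with any aggregate word-length test.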

(2) Deceptive first letters / vertical Neal keys

At the Voynich pub meet, Philip Neal announced an extremely neat result that I hadn’t previously noticed or heard of: that Voynichese words where the second letter is EVA ‘y’ (i.e. ‘9’) predominantly appear as the first word of a line. EVA ‘y’ occurs very often word-final, reasonably often word-initial (most notably in labels), but only rarely in the middle of a word, which makes this a troublesome result to account for in terms of straightforward ciphers.

And yet it sits extremely comfortably with the idea that the first letter of a line may be serving some other purpose – perhaps a null character, or (as both Philip and I have speculated, though admittedly he remains far less convinced than I am) a ‘vertical key’, i.e. a set of letters transposed from elsewhere in the line, paragraph or page, and moved there to remove “tells” from inside the main flow of the text.
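A result like this is easy to check against whatever transcription you have to hand. The Python sketch below counts words whose second letter is EVA 'y', split by whether they are the first word on their line. The function name is my own, and the `min_len` parameter is my own assumption rather than anything Philip specified: the default of 3 excludes two-letter words such as "dy", where the 'y' is also word-final.

```python
def second_letter_y_counts(lines, min_len=3):
    """Return (line_initial, elsewhere) counts of words whose second letter is 'y'."""
    line_initial = elsewhere = 0
    for line in lines:
        for position, word in enumerate(line.split()):
            if len(word) >= min_len and word[1] == "y":
                if position == 0:
                    line_initial += 1
                else:
                    elsewhere += 1
    return line_initial, elsewhere

# Two lines from f17v as transcribed above, with the line-end markers removed.
lines = ["oytor okeor okar okol doiir am",
         "qokeey kcha rol dy chol daiin sy"]
print(second_letter_y_counts(lines))             # two-letter words excluded
print(second_letter_y_counts(lines, min_len=2))  # two-letter words included
```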

(3) Horizontal Neal keys

Another very hard-to-explain observation that Philip Neal made some years ago is that many paragraphs contain a pair of matching gallows (typically single-leg gallows) about 2/3rds of the way across their topmost line: and that the Voynichese text between the pair often presents unusual patterns / characteristics. In fact, I’d suggest that “long” (stretched-out) single-leg gallows or “split” (extended) double-leg gallows could well be “cipher fossils”, other ways to delimit blocks of characters that were tried out in an early stage of the enciphering process, before the encipherer settled on the (far less visually obvious) trick of using pairs of single-leg gallows instead.

Incidentally, my strong suspicion remains that both horizontal and vertical Neal keys are the first “bundling-up” half of an on-page transposition cipher mechanism, and that the other “unbundling” half is formed by the double-leg gallows (EVA ‘t’ and ‘k’). That is to say, that tell-tale letters get moved from the text into horizontal and vertical key sequences, and replaced by EVA ‘t’ (probably horizontal key) or EVA ‘k’ (probably vertical key). I don’t claim to understand it 100%, but that would seem to be a pretty good stab at explaining at least some of the systematic oddness (such as “qokedy qokedy dal qokedy qokedy” etc) we do see.

Regardless of whether or not my hunch about this is right, transposition ciphers of precisely this kind of trickiness were loosely described by Alberti in his 1465 book (as part of his overall “literature review”), and I would argue that these ‘key’ sequences so closely resemble some kind of non-obvious transposition that you ignore them at your peril. Particularly if you’re running stats tests.

(4) Numbers hidden in aiv / aiiv / aiiiv scribal flourishes

This is a neat bit of Herbal-A steganography I noted in my 2006 book, which would require better scans to test properly (one day, one day). But if I’m right (and the actual value encoded in an ai[i][i]v group is entirely held in the scribal flourish of the ‘v’ (EVA ‘n’) at the end), then all the real content has been discarded during the transcription, and no amount of statistical processing will ever get that back, sorry. 🙁

(5) Continuation punctuation at end of line

As I noted last year, the use of the double-hyphen as a continuation punctuation character at the end of a line predated Gutenberg, and in fact was in use in the 13th century in France and much earlier in Hebrew manuscripts. And so there would seem to be ample reason to at least suspect that the EVA ‘am’ group we see at line-ends may well encipher such a double-hyphen. Yet even so, people continue to feed these line-ending curios into their stats, as if they were just the same as any other character. Maybe they are, but… maybe they aren’t.

Incidentally, if you analyze the average length of words in both Voynichese and printed works relative to their position on the line, you’ll find (as Elmar Vogt did) that the first word in a line is often slightly longer than the others. There is a simple explanation for this in printed books: that short words can often be squeezed onto the end of the preceding line.
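The position-on-the-line measurement can be sketched in a few lines of Python. This is my own minimal version rather than Elmar's actual method: it assumes space-separated words with line-end markers already removed, and simply averages word length for each word position across all lines.

```python
from collections import defaultdict

def mean_length_by_position(lines):
    """Map each word position (0 = first word on the line) to the mean word length there."""
    totals = defaultdict(lambda: [0, 0])  # position -> [sum of lengths, word count]
    for line in lines:
        for position, word in enumerate(line.split()):
            totals[position][0] += len(word)
            totals[position][1] += 1
    return {pos: total / count for pos, (total, count) in sorted(totals.items())}

# Three lines from f96v as transcribed above, with the line-end markers removed.
lines = ["sheodal chor ary cthol",
         "ycheey ckheal daiins",
         "oeol ckheor cheor aiin"]
for position, mean in mean_length_by_position(lines).items():
    print(position, round(mean, 2))
```

Note that positions near the end of the line are averaged over fewer lines, so those means are noisier.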

(6) Shorthand tokens – abbreviation, truncation

Personally, I’ve long suspected that several Voynichese glyphs encipher the equivalent of scribal shorthand marks: in particular, that mid-word ‘8’ enciphers contraction (‘contractio’) and word-final ‘9’ enciphers truncation (‘truncatio’) [though ‘8’ and ‘9’ in other positions very likely have other meanings]. I think it’s extraordinarily hard to account for the way that mid-word ‘8’ and word-final ‘9’ work in terms of normal letters: and so I believe the presence of shorthand to be a very pragmatic hypothesis to help explain what’s going on with these glyphs.

But if I’m even slightly right, this would be an entirely different category of plaintext from that which researchers such as Knight and Reddy have focused upon most… hence many of their working assumptions (as evidenced by the discussion in the 2010/2011 paper) would be just wrong.

(7) Verbose cipher

I’ve also long believed that many pairs of Voynichese letters (al / ol / ar / or / ee / eee / ch, plus also o+gallows and y-gallows pairs) encipher a single plaintext letter. This is a cipher hack that recurs in many 15th century ciphers I’ve seen (and so is completely in accord with the radiocarbon dating), but which would throw a very large spanner both in vowel-consonant search algorithms and in Hidden Markov Models (HMMs), both of which almost always rely on a flat (and ‘stateful’) input text to produce meaningful results. If these kinds of assumptions fail to be true, the usefulness of many such clever analytical tools falls painfully close to zero.
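If you wanted to test this, the first step would be to re-tokenize the transcription so that each candidate group becomes a single symbol before any stats get run. Here is a minimal Python sketch using the groups listed above. The greedy left-to-right, longest-match-first rule is my own assumption (other parsings are possible where groups could overlap), as is taking the gallows to be EVA t, k, p and f.

```python
GALLOWS = "tkpf"  # EVA gallows letters
GROUPS = ["eee", "ee", "al", "ol", "ar", "or", "ch"]
GROUPS += [prefix + g for prefix in "oy" for g in GALLOWS]  # ot, ok, op, of, yt, yk, yp, yf
GROUPS.sort(key=len, reverse=True)  # try longer groups first so 'eee' beats 'ee'

def tokenize(word):
    """Split an EVA word into tokens, treating each listed group as one symbol."""
    tokens, i = [], 0
    while i < len(word):
        for group in GROUPS:
            if word.startswith(group, i):
                tokens.append(group)
                i += len(group)
                break
        else:
            # No group starts here, so emit the single letter unchanged.
            tokens.append(word[i])
            i += 1
    return tokens

for word in ["okeeol", "chol", "daiin", "okeeey"]:
    print(word, tokenize(word))
```

The token streams, not the raw letters, are what you would then hand to a vowel-consonant search or an HMM.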

(8) Word-initial ‘4o’

Since writing my book, I’ve become reasonably convinced that the common ‘4o’ [EVA ‘qo’] pair may well be nothing more complex than a steganographic way of writing ‘lo’ (i.e. ‘the’ in Italian), and then concealing its (often cryptologically tell-tale) presence by eliding it with the start of the following word. Hence ‘qokedy’ would actually be an elided version of “qo kedy”.

Moreover, I’m pretty sure that the shape “4o” was used as a shorthand sign for “quaestio” in 14th century Italian legal documents, before being appropriated by a fair few 15th century northern Italian ciphers (a category into which I happen to believe the Voynich falls). If even some of this is right, then we’re facing not just substitution ciphers, but also a mix of steganography and space transposition ciphers, all of which serves to make modern pure statistical analysis far less fruitful a toolbox than it would otherwise be for straightforward ciphers.

* * * * * * *

Personally, when I give talks, I always genuinely like to get interesting questions from the audience (rather than “hey dude, do you, like, think aliens wrote the Voynich?”, yet again, *sigh*). So if anyone reading this is going along to Knight & Reddy’s talk at Stanford and feels the urge to heckle, errm, ask interesting questions that get to the heart of what they’ve been doing, you might consider asking them things along the general lines of:

* what transcription they are using, and how reliable they think it is?
* whether they consider spaces to be consistently reliable, and/or if they worry about how to parse half-spaces?
* whether they’ve tested different hypotheses for irregularities with the first word on each line?
* whether they believe there is any evidence for or against the presence of transposition within a page or a paragraph?
* whether they have compared it not just with abjad and vowel-less texts, but also with Quattrocento scribally abbreviated texts?
* whether they have looked for steganography, and have tried to adapt their tests around different steganographic hypotheses?
* whether they have tried to model common letter pairs as composite tokens?

I wonder how Knight and Reddy would respond if they were asked any of the above? Maybe we’ll get to find out… 😉

Or you could just ask them if aliens wrote it, I’m sure they’ve got a good answer prepared for that by now. 🙂

The Daily Grail has today’s hot cipher history story: that Dan Brown’s soon-to-be-released novel “Inferno” is somehow based around the Voynich Manuscript. Apparently, the proof of this particular pudding is, well, a cipher, one apparently hidden in plain sight on Brown’s website:-

dan-brown-voynich-code

In Rolf Harris’ immortal phrase, “Can you tell what it is yet?” I hope you can, because all it is is… a 4×4 transposition cipher of “MS 408 YALE LIBRARY”. Yes, that’s it. Which is in itself a fairly underwhelming starting point, considering that the Voynich Manuscript isn’t MS 408 in “Yale Library”, but in Yale University’s Beinecke Rare Book & Manuscript Library. But (of course) that wouldn’t fit in 16 letters. 🙂
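For anyone who hasn’t met one before, here is a minimal Python sketch of a 4×4 transposition: write the 16 characters into a grid row by row, then read them out column by column. The actual arrangement of the letters in Brown’s image isn’t described here, so the row-in, column-out order is purely an assumption for illustration, and the function names are my own.

```python
def to_grid(text, size=4):
    """Write the letters and digits of text into a size x size grid, row by row."""
    chars = [c for c in text.upper() if c.isalnum()]  # drop spaces and punctuation
    if len(chars) != size * size:
        raise ValueError(f"need exactly {size * size} characters, got {len(chars)}")
    return [chars[row * size:(row + 1) * size] for row in range(size)]

def read_columns(grid):
    """Read a square grid out column by column."""
    size = len(grid)
    return "".join(grid[row][col] for col in range(size) for row in range(size))

scrambled = read_columns(to_grid("MS 408 YALE LIBRARY"))
print(scrambled)                         # M8ERSYLA4AIR0LBY
print(read_columns(to_grid(scrambled)))  # MS408YALELIBRARY
```

Because reading by columns is the same as transposing the grid, doing it twice gets you back to where you started.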

So, the story of the story is that Dan Brown will once again be wheeling out his “symbologist” Robert Langdon in a Renaissance-art-history-conspiracy-somehow-impinges-on-the-present-day-with-terrible-consequences schtick, but this time in Florence with Dante’s “Inferno” right at the heart of it (hence the title), with only the poor, much-abused Voynich Manuscript for company.

One part I’m not looking forward to is what Brown will have Robert Langdon make of the Voynich: for of all the mysteries I’ve ever seen, the Voynich is surely the least obviously symbol-laden. There’s no “sacred geometry” there, no gematria, no heresy, in fact no religion at all: just about all you could do is tie in the Voynich ‘nymphs’ with the same kind of alt.history “goddess” thing that Brown tried to stripmine in The Da Vinci Code… but all the same, that looks fairly hollow to me. I guess we’ll have to see what angle he does take… at least we won’t have long to wait (14th May 2013).

For me, the central contrast between Dante’s Inferno and the Voynich Manuscript is that they are diametrically opposite in referentiality: while the Inferno (and in fact the whole Divine Comedy) reaches out to touch and even include all of human culture, the Voynich Manuscript’s author seems to have worked with the same kind of monastic intensity to ensure it appears to refer to nothing at all. So, when Dan Brown collides the everything-book with the nothing-book, what kind of po-faced bathos-fest are we in for?

As an aside, I don’t see any numerology in the (original) Inferno: and considering the amount of effort Dante put into satirizing astrologers, alchemists, politicians, liars, frauds and the like in their aptly tortured circles of hell, I’m reasonably sure he’d mete out the same kind of punishment to numerologists. And probably to symbologists, too. And (if we’re lucky) to bad novelists… though you’ll have to put your own candidates forward for that, I’m far too polite. 😉

However, the bit I dread most is when people start to realize that Dante Alighieri’s Inferno was only the first part (of three) of his Divine Comedy: and with the current Hollywood craze for trilogies (The Hobbit trilogy, really?), what are the odds Dan Brown will extend any success with this book out into his own money$pinning Dante-based series, hmmm? The “Ka’chingferno” three-parter, no less!

Update: Erni Lillie upbraids me (and rightly so) in a comment here for omitting to mention his substantial 2004 (though the Wayback Machine only has a copy from 2007) Voynich Inferno essay, where he proposed that the nine “rosettes” on the Voynich Manuscript’s nine-rosette page might well represent the nine layers of Dante’s Inferno. My own experience of working on that particular page would place it closer to Purgatory, but perhaps we’re closer than medieval theologians would have it. 🙂

Truth be told, I did remember that I had forgotten something to do with Dante and the Voynich, but couldn’t for the life of me remember what it was I’d forgotten. And now that I’ve found it again, I was delighted to read it all over again, Renaissance warts and all. So, hoping that it’s OK with Erni to bring his work to a new generation of interested readers, here’s a link to a copy of his paper The Voynich Manuscript as an Illustrated Commentary of Dante’s Divine Comedy. Maybe it will turn out to be what Dan Brown’s new book was amply inspired by this time round, who knows? 😉

Personally, I suspect the smart money is indeed on Brown’s having the Voynich’s nine-rosette map turn out to be a map: with the devastating twist *yawn* that it actually represents a physical map of Dante-related locations in Florence, which Robert Langdon is then able to decode at speed thanks to his encyclopaedic knowledge of all things symbolic and Florentine, which ultimately leads him to the dark secret at the heart of a centuries-old conspiracy which he and his unexpected accomplice must choose whether to reveal to the horror of the world.

You know, basically the same as all his other books. 😉

Anyway, looking forward to the launch party at the Duomo, darling. Of course I’d like more olives, thanks for asking, and isn’t the San Giovese simply, errrm, Divine? 🙂

A paper came out a few days ago on arXiv.org, called “Probing the statistical properties of unknown texts: application to the Voynich Manuscript”, written by three Brazilian academics (with an assist from two German academics).

The authors grouped Voynichese (i.e. Voynich text) hypotheses into three broad categories:

“(i) A sequence of words without a meaningful message;
(ii) a meaningful text written originally in an existing language which was coded (and possibly encrypted) in the Voynich alphabet; and
(iii) a meaningful text written in an unknown (possibly constructed) language.”

After developing a whole load of word-occurrence-based statistical machinery (defining “intermittency”, etc) and applying it both to real text corpora and to Voynichese, they conclude that the word structure of Voynichese is incompatible with shuffled texts (which is how they model (i)-class hypotheses), and “mostly compatible with natural languages” (the (ii)- and (iii)-class hypotheses). They end up by using their statistical machinery to suggest Voynichese “keywords” – words that, according to their statistical measures, stand out from the text.
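For anyone who wants a feel for this kind of test, here is a minimal Python sketch of the general idea. To be clear, it is my own toy illustration and not the authors' code: I am assuming that "intermittency" can be approximated by the coefficient of variation of the gaps between successive occurrences of a word (their exact definition may well differ), and the function names, the `min_count` threshold and the tiny made-up corpus are all hypothetical. Words that arrive in bursts score well above 1, while the same words in a shuffled copy of the text should score much lower.

```python
import random
import statistics
from collections import defaultdict


def intermittency(tokens, min_count=5):
    """Burstiness score per word: std/mean of the gaps between occurrences.

    Assumption: this coefficient of variation stands in for the paper's
    "intermittency"; rare words (fewer than min_count uses) are skipped.
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, where in positions.items():
        if len(where) < min_count:
            continue
        gaps = [later - earlier for earlier, later in zip(where, where[1:])]
        scores[word] = statistics.pstdev(gaps) / statistics.mean(gaps)
    return scores


def shuffled(tokens, seed=0):
    """Return a randomly reordered copy: the 'meaningless text' control."""
    rng = random.Random(seed)
    copy = list(tokens)
    rng.shuffle(copy)
    return copy


# Hypothetical toy corpus: filler words, plus "boat" and "sea" appearing
# only in two tight bursts (as a topic word would in a real book).
filler = ["the", "of", "and"] * 100
burst = ["boat", "sea"] * 10
tokens = filler + burst + filler + burst

original = intermittency(tokens)
control = intermittency(shuffled(tokens))

# Words ranked by how much burstier they are than in the shuffled control.
ranked = sorted(original, key=lambda w: original[w] - control.get(w, 0.0),
                reverse=True)
print(ranked[:2])
```

On a real transcription you would swap the toy corpus for the list of Voynichese words in page order, and would probably want to average the control over many different shuffles rather than rely on a single one.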

Their suggested English keywords (generated from the New Testament) are:-
* begat Pilate talents loaves Herod tares vineyard shall boat demons ye pay sabbath hear whosoever

Their suggested Voynichese keywords (generated from an EVA transcription, though they don’t say which, so possibly Takahashi’s?):-
* cthy qokeedy shedy qokain chor lkaiin qol lchedy sho qokaiin olkeedy qokal qotain dchor otedy

OK, but… what do I think? First off, I’m pleased to see that their results seem incompatible with “shuffled texts” or randomized texts, because that is what nearly all of the various Voynich “hoax” hypotheses rely on. Intuitively, just about anyone who has worked with Voynichese for any period of time is struck by its intense internal structuring on many levels: so it is nice to see the same result coming out from a different angle.

Secondly, what they mean by “mostly compatible” is that while Voynichese passes many of their proposed tests comfortably, it actually fails some of them (and only passes others by the slimmest of whiskers). To me, that implies either (a) an exotically- (and non-obviously-)structured language or constructed language, or (b) an obfuscated language (e.g. a ciphertext or shorthand): conversely, it seems to imply that Voynichese isn’t a one-to-one map of any mainstream language (which is what cryptographers such as Elizebeth Friedman have been saying for years). Yet the earliest constructed language we currently know of was devised at least a century after the Voynich’s vellum dating (and about a century after its earliest marginalia), so we can almost certainly rule that possibility out.

I don’t know: while it’s always good to see people approaching the Voynich Manuscript from a new angle, I can’t help but feel that in just about every instance the Voynich’s author remains at least three or four steps ahead of them. The key paradox of Voynichese revolves around the fact that even though it so resembles a natural language, the way its words work as semantic units fails to do so in quite the same way. So for me, the important thing here is to try to understand the tests that failed, and see what they tell us about how Voynich words don’t work… but that will doubtless take a little time.

As for the suggested keywords: personally, I’d be rather more convinced by their statistical machinery if it had automagically suggested the word “Jesus” rather than “boat” or “vineyard” for the New Testament, so I have to say I’m far from persuaded that their list of Voynich cribs will help us unlock its secrets at all… but you never know, so perhaps let’s give them the benefit of the doubt on this one! 😉

Just the merest hint of a nudge to your collective set of virtual elbows, to remind you that the first Voynich London pub meet for basically ages is this evening (7th March 2013), at The Prospect of Whitby in Wapping. Though having said that, all cipher mysteries are fair game, not just the Voynich Manuscript: hence cipher pigeon fanciers and armchair treasure hunters are more than welcome to come along too. Plenty of room for everyone!

I’ll be there from 6.15pm or so, hoping to catch up on the latest Euro cipher gossip from Gotha and elsewhere, courtesy of Herr Cipher Skeptic himself, Klaus Schmeh, who’s on a flying visit to London having had a swift peek at the various enciphered books in the British Library (“The Subtlety of Witches”, etc). So if you can make your way to Wapping Wall for even half an hour, it would be really great to see you.

[Even stronger nudge: Tony Gaffney, what on earth do I have to do to persuade you to come along? I haven’t seen you in 25 years or so!]

Just so you know: if it’s a nice evening (or if someone happens to bring their dog along with them, John 🙂 ), the chances are we’ll be located in the terraced area through the pub to the back left (looking out over the Thames). Otherwise, we could be anywhere on the pub’s two floors, depending on how busy it happens to be. Looking forward to it!

It’s been a while, but the time has finally come round for another Voynich London pub meet, on Thursday 7th March 2013 at the Prospect of Whitby in Wapping, a pub with its own gallows and noose (though admittedly these days it’s Somali pirates who get all the press rather than privateers). I’ll be there from 6pm onwards, hope to see some of you there too!

The reason for the weekday (i.e. not the usual Sunday) is that German cipher mystery skeptic Klaus Schmeh is over in the UK for a very few days & the 7th is the only evening he can squeeze into his packed schedule. I can’t change that and would like to catch up with him, so what’s a Cipher Mysteries blogger to do? Make do with the cards he’s dealt, that’s what… it is what it is.

This has, of course, been Schmeh Week on Cipher Mysteries, what with The Gentlemen’s Cipher from Klaus’ blog and this week’s diplomatic cipher conference in Gotha. So if (like me) you’d like to chat with Klaus about the conference, or perhaps chat with me about cipher stuff (if reading all my posts isn’t a rich enough diet for you), then feel free to swing along to Wapping. WW2 cipher pigeon fans welcome too! Cheers! 🙂

I found out today that Slovakian publisher CAD Press late last year brought out a new facsimile edition of the Voynich Manuscript, preceded by 176 pages reviewing its history, apparent contents, mad theories, etc. Of course, reading Czech helps, though it contains plenty of other pictures (i.e. quite apart from Voynich imagery) should you wish to buy it as an unreadable coffee table book. 😉

As far as I can tell, the author of the preface (Dr. Jitka Lenková, I believe?) seems to be hopeful that the manuscript’s origin will ultimately turn out to be somewhere in Bohemia. Well, I guess a bit of nationalist spin rarely goes amiss with your home audience: but such rhetoric would be a bit nicer if it were accompanied by a bit of, errrm, factuality to back it up, hmmm?

And no, I don’t really think the Voynich has anything to do with Jan z Lazu, about whom I’ve blogged a fair few times. Sorry again!