Diane O’Donovan has recently commented (here and elsewhere) and posted a number of times (on her own blog) about the priority of various Voynich ideas. For any given Voynich idea, who was first to mention, conceive, propose, argue, or even (puts tin hat on head and ducks) form it into a Voynich theory outlined in the TLS?

The immediate problem (obviously enough) is that 99% of Voynich ideas are groundless nonsense, homeopathically anchored on the sands of whimsical misreading, fanciful speculation, and over-optimistic just-so-ness. As a general category, then, it’s right up there with all the “pathology of cryptology” first outlined by David Kahn and more recently buffed by Klaus Schmeh in Cryptologia.

To be sure, Diane isn’t concerned with the priority of nutty Voynich ideas, such as the “diary of a stranded alien” notion, which every few days still manages to get reposted somewhere or other on the Internet. (And that is far from the nuttiest… so please excuse me if I don’t winch myself back down into the darkling pit containing the worst of the genre.)

Rather, she has formed a set of theories about the Voynich Manuscript which she believes to be both novel and true: and she is anxious/concerned to ensure that nobody should steal those ideas (i.e. by presenting them as commonplaces, or by passing them off as their own) and thereby deprive her of her ultimate Voynich research glory. As such, asserting priority has become an increasingly big concern of hers of late.

Well… I must confess that I do have a certain amount of sympathy for the desire to look back at what has been put forward in the past. However, even though I often feel the specific need to refer to D’Imperio’s index or to grep the archives of the old Voynich mailing list, for me this is only to try to gain a richer perspective on a particular topic, e.g. by looking at the conversations around it.

A significant part of the difference between her and me would therefore seem to be that I look backwards to try to place ideas in their context and by so doing to enrich my understanding of them; while Diane looks backwards to ensure that her ideas are genuinely hers, and that she hasn’t inadvertantly taken that which is someone else’s.

To the very greatest degree, then, priority is a non-issue for me, in that it is something that will get resolved (a) only once we can definitively decrypt Voynichese, and (b) by an entirely different kind of forensic historian (i.e. not by the people doing the research). Given that I see so few genuinely productive research paths being taken at the present time, the value of worrying about priority right now is surely inversely proportional to that of finding a rich new research furrow to plough.

Voynich priority, then, for the 99% of ideas out there that are complete bullshit, is surely an utter waste of time. And for the 1% of partially tenable ideas, it’s no more than very marginally better than that, and will make only sense once the plates have been cleared away after the big Voynich solution pizza party.

Anyone who hasn’t yet grasped that the solution to the Voynich will most likely fall squarely in the middle of the wide multi-dimensional chasms between our falteringly thin tendrils of historical and cryptological insight hasn’t been paying enough attention. That solution will most likely surprise us more by its curious proximity to the many sensible things that have been said about the manuscript, not by its distance from them.

When talking about the Zodiac Killer Z340 cipher, FBI cryptanalyst Dan Olson once pointed out that:

Statistical tests indicate a higher level of randomness by row, than by column. This indicates that the cipher is written horizontally and rules out any transposition patterns that are not strictly horizontal.

Here, while I’d agree with his observation part (the first sentence), I’m really not so sure about the conclusion part (the second sentence). And a little further on, Olson continues:

Row randomness of 408 is .22, 340 is .19. Column randomness of 408 is .48, 340 is .68. By way of comparison, row and column randomness should be near identical if the 340 does not contain any message, or if there is a message that is evenly scrambled.

This second time round, I’m comfortable with the observations here (the first two sentences), and mostly comfortable with Olson’s conclusion (the last sentence). However, I’d add that you have to be careful with his conclusion, because there is an implicit (but incorrect) follow-on conclusion lurking just beyond its limits for many readers: that if the cipher is not sequenced along columns, it must surely be primarily sequenced along rows of the text.

On the positive side, I would agree that we can conclude from this that we are not looking at a ‘pure’ periodic transposition cipher (i.e. one that rakes over the whole ciphertext, or even over the top or bottom halves). But what would it mean to assert that the Z340 is a bit more horizontal than vertical, though not as horizontal as the Z408?

An New Axis to Grind?

My (admittedly as-yet-hypothetical) explanation for all of the above is that what lurks behind is perhaps a short transposition cycle (i.e. no more than two or three elements long), where the elements are arranged across two or three consecutive lines, and where the end of each cycle steps back to the letter position immediately after the beginning of the cycle.

According to this, each ciphertext line would contain every second or third letter in the plaintext: for even though this would weaken the horizontal (row) adjacency patterning, it would not eliminate it. And statistically, this is essentially what we see: weakened horizontal patterning but no obvious vertical patterning. Because of the apparent groups of three lines (also noted by Olson), I suspect that these are arranged over three lines: and so this forms my primary hypothesis going forward.

A Quick JavaScript Test

I’ve posted up a quick JavaScript gist of what I’m talking about here: https://gist.github.com/anonymous/c53f88caf1dc6bd18a6bf6af45895b2c

The preliminary results of running this code fragment yields a different internal structure to each of the two halves (various intriguing results in bold):

Top half, first nine lines:
0: off2 = 3, off3 = 3, metric = 8
1: off2 = 2, off3 = 6, metric = 8
2: off2 = 2, off3 = 3, metric = 8
3: off2 = 0, off3 = 3, metric = 7

4: off2 = 3, off3 = 14, metric = 6
5: off2 = 1, off3 = 7, metric = 6
6: off2 = 0, off3 = 7, metric = 6
7: off2 = 3, off3 = 2, metric = 5
8: off2 = 2, off3 = 7, metric = 5
9: off2 = 2, off3 = 5, metric = 5

Bottom half, first nine lines:
0: off2 = 1, off3 = 0, metric = 10
1: off2 = 3, off3 = 11, metric = 9
2: off2 = 3, off3 = 10, metric = 9
3: off2 = 0, off3 = 4, metric = 9
4: off2 = 3, off3 = 15, metric = 8
5: off2 = 0, off3 = 8, metric = 8
6: off2 = 4, off3 = 8, metric = 7
7: off2 = 4, off3 = 4, metric = 7
8: off2 = 2, off3 = 15, metric = 7
9: off2 = 0, off3 = 10, metric = 7

Note that the period-19 (i.e. 17+2) effect is still slightly visible in the top half, but it’s much less apparent in the bottom half.

However, the most striking new pattern here is the (off2 = 1, off3 = 0) pattern in the bottom half, that yields ten pair matches in the untransposed text. This is the kind of zigzag transposition pattern one might expect of what Filippo Sinagra calls “peasant ciphers” – improvised amateur cryptographic tricks, that aim for security through obscurity.

Of course, I still have no idea whether or not I’m merely generating coincidences from the 17 x 17 x 2 = 578 permutations being examined here. But nonetheless it’s all quite interesting, right?

For a while, I got into the habit of picking up Voynich-related blog updates via voynich.ninja’s convenient Blogosphere Reader. However, I always knew that this was icing rather than cake (i.e. it wasn’t really the right way to do it), and that I should instead get round to configuring my own RSS feed reader to do all that kind of stuff directly. (I also wanted to be able to find a way of translating RSS feeds.)

The route that I would mainly be using to pick up RSS feeds was via my mobile on the morning or evening commute: and so I chose feedly.com, which works well both as a desktop site and as a mobile site. (There are plenty of other RSS readers besides Feedly, so feel free to choose whichever one works best for you.)

One nice thing about Feedly is that it allows you to export and import a list of feeds, which lets you share them easily with others: and so if you fancy setting up your own set of RSS feeds, here’s a an OPML file containing my current set of Cipher History RSS feeds for you to get started with.

To load this in Feedly…
* save the above OPML file onto your machine (i.e. “Save As…” from your browser)
* click on [+ Add Content] at the bottom left of Feedly’s interface
* click on the [Import OPML] item that pops up
* navigate to (and select) the OPML file you just saved out
…and off you go.

If you’ve logged in to Feedly via Google, the feeds you’ve added in your desktop should also automatically appear in Feedly on your mobile. Which is nice.

Translating RSS feeds

The above setup works well and straightforwardly for reading English RSS feeds in English. Sometimes, however, you may find yourself wanting to track blogs in other languages.

For example, despite having been to Hungary a couple of times (and liked the country very much), I know that I am fairly unlikely to suddenly acquire a desire to learn Hungarian (or probably any Finno-Ugric language, to be fair): and yet I would very much like to track Benedek Lang’s consistently interesting cipher history blog.

It used to be the case that you could use (the now discontinued) Google Reader app to achieve this extra translation step. However, there is (currently) a handy way of achieving the same basic result via Google Script, as described on Amit Agarwal’s Digital Inspiration website.

However, Amit’s script didn’t quite do what I wanted, so I updated the web app code to also route each RSS entry’s web link via Google Translate: here’s a link to my updated version of the Digital Inspiration script. The only practical problem is that (as you already know if you use Google Translate to translate web pages) the style sheet can often get lost in transit, which means that the translated page is often not as pretty as it probably should be: but at least it’s in the language you wanted.

If you have other cipher history blogs (English or non-English) you think ought to be on this list, please let me know, and I’ll add them to the list and update the OPML file.

I’ve had the Zodiac Killer Z340 cipher on my mind for the last few days. Though I’m still finding it hard not to draw the conclusion that its top and bottom halves are two different ciphertexts (joined together for reason(s) we can only hazily guess at), what has drawn so much of my attention is a quite different class of statistical observation: letter skips.

Letter Skips

The most (in)famous example of letter skips was the Bible Code, made famous by Michael Drosnin’s (1997) book The Bible Code. However, this was merely one in a long line claiming that the Bible is not only the literal and exact Word of God, but is also an implicit encipherment of all manner of unexpected occult statements and prophecies. To get to these secret messages, all you have to do is read every nth letter, modulo length(Bible): and then, if you hunt through the vast swathes of near-random junk that emerges from that, you’ll eventually discover words, phrases, and proper names that couldn’t possibly have been known millennia ago when the Bible was first written down.

There have been plenty of mathematical and statistical dismissals of the Bible Code, almost all of which reduce to the simple argument that if you search enough random letter sequences for long enough, you’ll find something that sort of looks like text. And so when Drosnin huffed that “When my critics find a message about the assassination of a prime minister encrypted in Moby Dick, I’ll believe them”, his critics took it literally as a challenge. As a result, we now have lists of numerous Drosnin-style letter-skip ‘predictions’ in Moby Dick, along with a ‘prediction’ of Princess Diana’s death [thanks to Brendan McKay].

From which the moral unavoidably seems to be: be careful what you wish for.

Generated Coincidences

At the heart of the Bible Code lies a simple sampling fallacy: which is that if you perform a long enough series of arbitrary statistical analyses on the text of any given document, you will (eventually) uncover things in it which superficially appear extraordinarily improbable.

This is directly relevant to a lot of the Zodiac Killer code-breaking discourse because, broadly speaking, it is exactly what has happened there: diligent statistical enquiry has yielded not only millions of strike-out tests, but also a large number of (superficially) unlikely-looking patterns. And so the question is: if you perform a hundred different statistical tests and one of them happens to yield a pattern that only appears in one in two hundred randomised versions of the same document, have you (a) found something fundamental and causal that could possibly explain everything, or (b) just generated a coincidence that means nothing?

Sadly, there is no obvious way of telling the difference: all one can do is nod sagely and say, in the words of a great 1970s philosopher…

…”COULD BE!

Transposition or “Tasoiin rnpsto”?

As should be plain as day from the above, I too view Bible Code letter skips as complete nonsense, and reserve my inalienable human right to cast a similarly cool eye over the impressive panoply of Zodiac Killer cipher observations, each of which may or may not be a generated coincidence.

Even so, utter disbelief of the specifics of the Bible Code shouldn’t mask the fact that the kind of statistical tests that are used for letter skips share a significant overlap with the kind of statistical tests that help reveal periodic ciphers and transposition ciphers.

Hence evidence of a letter-skip period in the Zodiac Killer Cipher should not be automatically put to one side because of the test’s association with hallucinatory Bible Code letter-skips, because evidence of a periodic effect could instead be pointing towards one of many other phenomena.

And there is indeed strong evidence of a period in play in the Z340, as first discussed by Daikon and Jarlve in 2015. Daikon examined the number of Z340 bigram repeats at different periods, and found a significant spike at period 19 (this really is noticeably larger than the other periods).

Here’s what these period-19 bigram repeats look like (was this diagram made by David Oranchak?):

Having then performed 1,000,000 random shuffles, David Oranchak concluded that this period-19 result had a “1 in 216” chance of happening. Which is good, but just a smidgeon short of great.

Incidentally, it’s easier to see these bigram matches if you rewrite Z340 in 19-wide columns (this diagram also probably made by David Oranchak):

More tests revealed all manner of similar periodic results that may or may not mean something: but I’m interested here specifically in the period-19 result.

Period-19? So what?

When he constructed the Z340, the Zodiac Killer had previously seen his Z408 cipher not only printed on the front page of newspapers (which surely pleased him), but also very publicly cracked (which surely displeased him). And yet his Z340 cipher closely resembles the Z408 in so many ways that it seems a fairly safe bet to me that his later cipher system was nothing more than a modification (a ‘delta’) of the earlier cipher system rather than something wildly different.

Hence I’ve long suspected that if we could somehow work out what the Zodiac Killer thought was technically wrong with the Z408 cipher system, then we could make a guess what his delta to the Z340 system might be.

Even though the Z408 presented all manner of homophone cycles, it wasn’t these that gave the game away to Donald Gene Harden and Bettye June Harden of Salinas. Rather, they made a number of shrewd psychological guesses (that the most likely first word a psychopath would write was “I”, and that the plaintext would include the word “KILL” multiple times), and used repetitions of “LL” as cribbed ways in to the message.

(As an aside, I struggle to believe that Bettye Harden genuinely guessed from scratch that the first three words of Z408 would be “I LIKE KILLING”, as has been reported. Instead, it seems far more likely to me that she had already worked for several hours on the cipher before making such an inspired guess.)

And so it seems most likely to me that the Zodiac Killer conceived his delta specifically as a way of disrupting the weakness of doubled letters (specifically doubled L), but without really affecting the rest of his code-making approach. And as always in cryptography, there are numerous ways this could be achieved:
* removing the second letter of all doubled letter pairs
* adding in new tokens for specific doubled letters (e.g. use ‘$’ to encipher ‘LL’)
* disrupt the order of the letters (i.e. transpose them) so that ILIKEKILLING becomes IIEILN LKKLIG etc

I’m therefore wondering if his cipher system delta was some kind of period-19 transposition. But – of course – people have already checked for the presence of straightforward period-19 transposition, and have basically drawn a blank. So if there is a period-19 ‘signature’ arising from some kind of transposition, it’s a little more complicated.

But if so, then what would it look like?

A three-way line dance?

My final piece of observational jigsaw in today’s reasoning chain is that the Z340 ciphertext is apparently arranged in groups of three lines. FBI cryptanalyst Dan Olson famously commented that…

Lines 1-3 and 11-13 contain a distinct higher level of randomness than lines 4-6 and 14-16. This appears to be intentional and indicates that lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake.

…though note that this mixes up observation (the first sentence) with his best-guess inference (the second sentence). What I’m instead taking is that Olson’s observation more generally implies that lines are somehow grouped together in sets of three BUT with a spare line added in between the top and bottom half.

So, the overall line grouping sequence of the Z340 appears to be:
* top half: 1-1-1 2-2-2 3-3-3 X [a spare line with “cut marks” at either end of a fake line]
* bottom half: 4-4-4 5-5-5 6-6-6 X [a spare line with ‘ZODAIK’-like fake signature at the end]

Hence – putting it all together – I’m now wondering whether there is a period-19 transposition in play here BUT arranged in groups of three lines at a time. In which case, the symbol sequence for each set of three lines (3 x 17 = 51) might well look like this (where 01 is the first symbol of the plaintext, 02 is the second symbol, etc):

* 01 04 07 10 13 16 19 22 25 28 31 34 37 40 43 46 49
* 47 50 02 05 08 11 14 17 20 23 26 29 32 35 38 41 44
* 42 45 48 51 03 06 09 12 15 18 21 24 27 30 33 36 39

This transposition arrangement would yield both the period-19 effect and the groups-of-three-lines effect: and might also go some of the way towards explaining why lines 10 and 20 function differently to the other lines.

As I mentioned at the top of the post, I also strongly suspect that the top half of the Z340 and the bottom half of the Z340 are separate ciphertext systems, and so any solving should be attempted on the two halves individually, however inconvenient that may be. 🙂

I haven’t tested out this new transposition hypothesis yet: but it’s definitely worth a look, wouldn’t you think, hmmm?

Here’s a link to a nice little piece on the Voynich Manuscript that came out today in classy American online magazine Vox. Though clearly triggered by Nicholas Gibbs’ recent TLS non-theory, the article steers well clear of presenting it (or indeed anything else) as a Voynich solution or explanation – and praise be to the Cipher Gods On High for that small mercy.

Unusually, Vox’s mission – to engage with newsworthy subjects and explain them really clearly – is almost Reithian. In these online days (where journalism so often ends up thinner than the paper it’s printed on), this is an approach that’s so brutally old-fashioned it’s close to subversive. Whatever next – shock jocks touching on deeper truths that no-one dare name?

It may sound a little shallow, but I was actually very pleased to find my words used to close the article: that “The evil beauty of the Voynich manuscript […] is that it holds a mirror up to our souls”, i.e. all the while the Voynich Manuscript’s secrets remains uncracked, it seems we will always have to endure people peering into its haruspicious sheen and seeing exactly what they want to find. Oh well! 😐

I’ve had a dissatisfying, rubbish day today: but given that every day I’ve previously had that involved some kind of interaction with Stephen Bax had been a bad day, perhaps there should be no element of surprise involved.

Bax To The Future

In the case of the Voynich Manuscript, there are at least ten reasonable arguments I can see (even if I happen to disagree with all of them) for a linguistic reading: what frustrates me so much about Bax is that the arguments he puts forward aren’t any of them (or even close to them). Hence he inevitably finds the best form of defence is attack: and given that I’m just about the only person not fawning over him, guess who gets attacked?

Frankly, I’d rather stick flaming needles under my fingernails than experience any more of his wit, wisdom, and whatever in the absence of any effective moderation: so goodbye to voynich.ninja it has to be, sorry.

Doubt With The Old

Of course, Bax himself isn’t the root cause of all this: the real problem is that almost all genuine Voynich experts seem to doubt the depth of their confidence in what they know, and so choose silence over confrontation, no matter how foolish the provocation or how malformed the argument.

Yet even though I’ve been saying for over a decade that we now do know enough to take a principled stand against Voynich pseudoscience, pseudohistory, pseudolinguistics and enigmatology, it’s been a long wait for anyone to show any kind of solidarity with this point of view.

I therefore note with great interest that Rene Zandbergen has recently – after a decade of Rich SantaColoma’s incessant possibility-based argumentation – put up a page dismissing the modern hoax theory. This is, in my opinion, a huge milestone in Voynich discourse: but whether Rene or others will follow up with similarly comprehensive rebuttals of Rugg, Bax et al remains to be seen. When all you can see are vipers, where’s theriac when you really need it?

Dead Drunk On The Beach?

Cipher Mysteries readers will probably see a lot in common between the above and what passes for debate in The Somerton Man world. Even the straightforward disproof of the whole microwriting claim seems to have been overlooked by all the loudest shorts at the poolside: so please excuse me if I sip my Camilla Voodoo elsewhere.

Anyhoo, given that most historical-cipher-inspired songs seem happy to look no further than the Voynich’s surreality, today’s aural treat-ette for you all is a song from South London’s own JerkCurb called “Somerton Beach” (review here), where the wobbly guitars try to capture a kind of alcoholic pre-death haze. Which is nice, if oddly apposite, though I couldn’t easily explain why.

Here’s the evidence that the Zodiac Killer is alive and busy with a spray can in Cyprus, visual documentary evidence to which only the most obtuse could possibly object:

And if you think that’s the most ridiculous and/or foolish cipher theory you’ve encountered in the last seven days, you obviously haven’t been paying much attention. 🙁

In August 2016, I spent a day at the British Library trawling through many of its palaeography books (as I described here). What I was specifically in search of was examples of handwriting that matched the handwriting in the Voynich Manuscript, along with its marginalia.

As mentioned before, the document I found was Basel University Library A X 132: it’s a Sammelband (anthology or collection), with sections copied from a number of different medieval authors. The section I was most interested in (dated 1465) was fol. 83r through to fol. 101r.

With a little help from Stefan Mathys (thanks, Stefan!), I ordered some pages, along with some from the start and end of other sections, just in case the same scribal hand reappeared and included a little biographical information about that scribe. I’ve just begun writing this up as a paper (heaven knows that so little of any authority has been written about the Voynich, so I want to do this properly): but as I was going through, I noticed something interesting that I thought I’d share separately.

One of the extra sections I asked for began on f202r: and I must admit to being surprised to see an oddly familiar piece of marginalia there. Recalling the tiny marginalia at the top of the Voynich Manuscript’s page f17r…

…now look at the tiny marginalia at the top of A X 132’s fol. 202r, a “vocabularij hebreicus et grecus” (according to this):

The listing remarks that f202r is covered in “Stegmüller, Rep.bibl.6,93 Nr.8665”, i.e. Friedrich Stegmüller, Repertorium biblicum medii aevi, 11 vols. (I don’t believe that volume 6 is online, but please let me know if you manage to find a copy.)

Have you found any better matches than this?

A tip of my monkey’s uncle’s Susquehanna hat to Derek Abbott for today’s cipher history link: a new Voynich theory by Nicholas Gibbs in the Times Literary Supplement. Gibbs explains the circumstances that brought him to the Voynich Manuscript:

I am also a muralist and war artist with an understanding of the workings of picture narration, an advantage I was able to capitalize on for my research. A chance remark just over three years ago brought me a com­mission from a television production company to analyse the illustrations of the Voynich manuscript and examine the commentators’ theories.

however… all the descriptive part of his solution seems to have been culled from those parts of commentators’ reading lists that caught his eye, but then vaguely linked together into a sort of fairly unconvincing-sounding narrative. The only linguistically technical part of his “solution” in the TLS is given in tiny letters in the following image, which you can make out if you click on it and squint:

Note that the image is marked “p16_Gibbs1.jpg”: which seems to imply we have a book to (sort of) look forward to. Errrm… hooray.

I could list a whole load of things that are wrong with this, but I’d be typing all night on a TL;DR post and nobody would care. *sigh*