A new day, and a new episode of “The Hunt for the Zodiac Killer” (S01 E02, in programme guide speak) to trawl through. Luckily, though, this week’s episode proved to be fairly lightweight:

video since removed from TagTele site

Increasingly Uncomfortable

I’m sorry to have to say it, but as this series goes on, I’m getting less and less comfortable with the presentation. Televisually, the editing conceit has been to make it look as though all the evidence is being considered and discovered for the first time (and sort of ‘in real time’), but just about everything that pops up (hey, what magic beans has CARMEL found this time?) has previously been floated, shot down, raked over, partially resurrected and left hanging in a kind of evidential limbo ten or twenty times over. And so it’s more than a little bit grating to see so many creaky old ideas being dredged up and presented as if they were not only new, but also generated by a piece of software.

So, for all the potential cleverness of the software toolkit, CARMEL has been largely reduced here to a panto linking device, not a million miles from “And now for something completely different: a man with a tape recorder up his nose”. (Which, tape recorder reference aside, originally came from Blue Peter’s Christopher Trace, UK TV buffs might be interested to know.)

The connection the programme makers float between TWICH/TWICHED and SQUIRM/SQWIRM in the Cheri Jo Bates typed “Confession” letter and the Zodiac’s 1970 letter is interesting (of course, Michael Butterfield discussed this in 2009, though doubtless it was already old news by then). But in every other sense the Confession letter comes across to me as a crock, a simulation of a confession letter that says nothing actually new. And though two of the three notes that arrived six months later said that “BATES HAD TO DIE THERE WILL BE MORE Z”, these also come across to me as fakes, or rather someone simulating nuttiness.

At the same time, according to this Quester Files website page, “[t]he envelopes carried double postage. This is something the Zodiac did.” And moreover, “Sherwood Morrill, the Questioned Document’s examiner in Sacramento, examined the envelopes and writing. He said it was indeed The Zodiac’s handwriting.”

Putting all this together, even though I could comfortably accept the idea that the 1966 Cheri Jo Bates “Confession” letter and even the three (somewhat belated) nutty-looking handwritten notes were by the same person who would later become the Zodiac Killer, I would struggle to accept without any obvious supporting evidence the claim that he also killed Cheri Jo Bates, in the unquestioning way the programme makers seem to think their audience should. To me, it seems far safer to conclude that in 1966 Zodiac was merely fantasizing about killing, and that he wrote the suite of letters as a kind of performative role-play theatre, projecting his incipient psychopathy onto the stabby, bloody, horrible backdrop of some other properly mad person’s crime (which was very probably driven by enraged sexual inadequacy and/or spurned passion).

Lone Wolf, or Sea Wolf?

As a side note, the question of the origin of the Zodiac and his ‘cross-hair’ symbol comes up again and again: an obvious issue is that his chosen symbol is not in any way connected with astrological or zodiacal symbols, which is also true of the (for the most part letter, reflected letter, and part-filled geometrical) shapes he uses in his ciphertexts. So, then: why ‘Zodiac’?

However, I have to say that the answer to this question seems to me to be very simple and indeed painfully obvious (though it has of course previously been pointed out a thousand times or more): that the most likely inspiration for the Zodiac’s symbol and name was the Zodiac Watch Company, which had during the 1950s achieved great renown with their Sea Wolf diving watch (and very similar symbol).

Note that the Zodiac Sea Wolf watch had (of course it did) a movable bezel, much as per the Mt Diablo note: which is not anything like proof, of course, but it’s definitely something to think about. (As a further aside, perhaps a watch historian might like to tell us which movable watch bezels of the 1960s had 0 / 3 / 6 / 9 markings on them.)

All of which finally spins this post back round again to poor Cheri Jo Bates’ murder: for, as the zodiackillerfacts site points out, it was there that “[I]nvestigators came upon a man’s Timex watch lying on the ground near the body”. Honestly, does anyone truly believe that a psychopath who specifically named himself after a macho Swiss watch brand would be seen dead – if you’ll pardon the phrase – in an American Timex?

What journalists the world over love to do is to bring two vaguely related things together to review or discuss, as if by doing so they identify an incipient trend or fashion. Once pointed out, of course, this trick reeks of self-indulgent modern tossery: but in the scientific interest of exploring different ways of writing about unsolved historical ciphers, I thought it might at least be a little interesting to try on this dropped shoe, to see for myself whether it’s fur or glass.

So here goes.

Jess Feldman’s “Call It a Premonition”

Jess Feldman’s slim book of poetry “Call It a Premonition” eerily subtitles itself “Translations from the Voynich Manuscript”.

When I tweeted Jess to ask how her set of poems related to the Voynich Manuscript, she replied:

I didn’t base the poems on any specific pages. Just created the poems by looking at the manuscript holistically and imagining it as the diary of a 13 year old girl. Even though it’s a different time period, I was influenced by Hilary Mantel’s Wolf Hall.

Is it any good? Well, if you want you can read the 18-page PDF here and decide for yourself (and let’s face it, it’s not going to take long). For me, though, there are some obvious highlights: and with my unsolved historical cipher hat on, I certainly couldn’t read the line in “Another Dead Fellow”…

Look:
I’ve a shiny new hairpiece. See, cuz: a mermaid

…without thinking of the curious fish/mermaid hybrid that appears on f79v in Quire 13:

Similarly, the line in “I Dreamed Of”…

a leashed hart young doe collared in soft pink bouclé
the king’s own grazing animals

…brought to my mind the strangely red-painted animal on the same page (note where the heavy painter has smooshed green paint all over the lines, so that we can’t easily discern the outlines that were originally drawn here):

Finally, “The Summer of 1438” will surely ring true for some long-suffering Cipher Mysteries readers:

I’m learning
Sometimes
Unsolved mysteries
It’s better not to ask

My favourite single poem from the set, however, is her eponymous “Call It a Premonition”, which is last but certainly not least:

In a distant future
I look like a boy only
still a girl but in slacks okay
I am moving my lacquered
fingers so text materializes
across a framed fluid tapestry
without seams Without origin
unmarried far from dead
childless and leather-booted
I say words like Co-workers,
I hate this fucking job and
I mean it but I am happy
enough I think
comparatively speaking
nonetheless

To me, this kicks an angry skateboard shoe at our foolish modernity’s ankle in a way that stings both kicker and kickee. I enjoyed it, and hope for more good Feldmany stuff in the future, slacks or otherwise. 😉

The poetry of “Supercomputer CARMEL”

In many ways, all that ackshwall poetterie couldn’t really sit further from our second contender in the ring tonight. Championed by CompSci historical cipher buff Professor Kevin Knight, CARMEL is a supercomputer running AI pattern-recognition software (according to, wait, the History Channel press release is round here somewhere) that he filled with all manner of documents and items to do with the Zodiac Killer Cipher, as part of the Hunt For The Zodiac TV documentary series that started a few days ago.

Well… that’s basically what the History Channel (and the programme makers at Karga Seven) would like us to think. However, despite all the nicely-lit close-ups of charcoal grey racks studded with the obligatory array of blinking LEDs, CARMEL isn’t a “supercomputer” at all, it’s a laboratory teaching support code toolkit developed (in C++ with a bit of Boost) and extensively tweaked over the last twenty years at the Information Sciences Institute at USC where Knight is a professor. Strictly speaking, it’s a “finite-state transducer package written by Jonathan Graehl”, that basically lets you set up a whole load of linked and/or nested finite state machines and then do all manner of clever things with them, such as train them (i.e. condition the transition weights) just by passing data into them, via either “EM (expectation-maximization) training” or “Bayesian Chinese Restaurant Process training”.

In fact, ISI students (and doubtless many others) have wheeled out the CARMEL toolkit over that period to write all manner of experimental NLP (natural language processing) software, from cipher solvers – though if you have to repeat anything “trillions of times” to get a result, something’s probably going horribly wrong with your choice of algorithm – to accelerated knowledge acquisition to (yes, you guessed it) synthetic poetry generation.

The wonderfully creepy thing is that by seeding the (CARMEL toolkit assisted) poetry generator specifically with the corpus of letters written by the Zodiac Killer, you can – with a smidgeon of justification – say that you’re generating Zodiac Killer (-seeded) poetry. It’s by no means a simulated look into his head, sure, but rather a rhyming sampling plucked from the linguistic soup apparently (on the evidence of the contents of his letters, anyway) swilling around there. Sort of.

But is this artificial poetry any good? Well, (with the caveat that I don’t know the settings or the precise keywords used to seed the particular runs) I can do no better than quote some examples given on the web and let you decide for yourself.

Existence enters your entire nation.
A twisted mind reveals becoming manic,
An endless modern ending medication,
Another rotten soul becomes dynamic.

Or under pressure on genetic tests.
Surrounded by controlling my depression,
And only human torture never rests,
Or maybe you expect an easy lesson…

Handily, you can also generate your own here. I tried out a few for myself (the title of each is the keyword I chose): each took no more than a couple of seconds to pop up.

BRUTAL

Another day without a life abused.
Confused or blinded by the nitty gritty,
Afraid of being openly accused,
You and me against the world severely.

Or this:

PARADISE

Something like an angel on the vine!
An endless dream about eternal youth,
A power full of blood and sugar pine,
A sigh of love and see the naked truth.

OK, it’s not hard to see that these are simply terrible poems, mashing up pukable birthday card rhymes with a madman’s self-loathing and anger. But perhaps some entrepreneur will now use this to corner the market in psychopathic greetings cards (though perhaps even modern society has a little way further to descend before that really hits the zeitgeisty spot).

Hegelian synthesis?

This is of course the point where journalists, happy to have simulated catching a trend (or created a simulacrum of catching a trend, depending on how picky you are), press [Send], kick off their loafers, and inject some ridiculous Class A drug (or whatever journos prefer these days). “It’s a wrap!” as I once shouted angrily (having actually ordered a sandwich).

But having thought about it for a few days, I think that a common thread subtly links the two sets of poetry. In the case of Jess Feldman, she imagined herself into the inner mental world of “a 13 year old girl” living in a brutal historical period: while the Zodiac Killer text corpus also transports us (via the CompSci magic of trained finite-state transducers) into the stage of his broken psychodrama. Yet for me, the challenging link between the two is not the brutality of the two worlds (by which I mean the girl’s outer world and the Zodiac’s inner world) but the two sets’ shared performative aspect.

Why? Well: in my opinion, Jess Feldman’s poetry is – I think, but feel free to have your own opinion – written as performance pieces: the poems are not hurty / shouty / broken / combat wordplay, but stuff that could comfortably be performed: their language is consistently wry and agile, half-spoken thoughts in the grey area between what we say and what we think.

Similarly, even though the Zodiac Killer synthetic poetry is essentially an experimental linguistic simulation, it is necessarily anchored within the inherently narcissistic and manipulative language he employed for effect in his letters: and whether you like it or not, I think it is the raw language that constrains the computer’s output far more than the rhyme scheme.

Without any real doubt, the Zodiac Killer’s letters (and indeed his ciphertexts) were constructed solely for the practical needs of his hateful theatre and his need to control by terror, not for communication. The notion that we might ever actually learn something so banal as his name from any of his ciphertexts (even the Z13) seems to me ridiculous and pathetically needy: what he wrote was never a diary, but a performance that’s both as fake as Reality TV and as psychotic as Charles Manson.

The History Channel’s new Karga Seven-produced series “The Hunt for the Zodiac Killer” has just started: it features top Zodiac Killer researcher Dave Oranchak, Copiale Cipher cracker (and Voynich Manuscript fanboy) Kevin Knight, long-time Zodiac researcher (and University of North Texas CompSci professor) Ryan Garlick, Cryptologia editor (and author of “Unsolved!”) Craig Bauer, and a Google engineer guy called Sujith Ravi who I’ve never heard of, but who I’m sure is a lovely bloke.

The film makers also have a retired homicide cop and a cold case searcher both talking to people on the ground and raking over hitherto unseen archives. Overall, the televisual conceit they try to sustain is that the whole process is unfolding right before our eyes in sort-of real time, which is a decent enough fiction to structure this kind of thing by.

The first episode tries to link the Zodiac Killer with the unsolved 1966 Riverside murder of 18-year-old Cheri Jo Bates, a connection that has been floated (yet also denied) many times. The suspect they quickly move to is a certain Ross Sullivan, who was a student who also worked in the library where Bates was studying the evening of her death: Sullivan took a cryptography class, wore army boots, disappeared the day after Bates’ death, came back with fresh clothes a few weeks later, etc etc etc.

The programme alludes to DNA comparisons, to reconstructing faces purely from DNA, and to cracking some part of one of the Zodiac Killer’s ciphers in the following episodes, but we’ll have to see which of these promises they keep. If all they ultimately serve up on their delightful silver platter is Craig Bauer’s ALFREDENEUMAN hopeful crack of the Z13 cipher (whose first three cipher letters are also “AEN”), I’m not 100% certain the cryptological ticker tape and marching elephants parade will be on duty that day. But, as always, we shall see.

Incidentally, my personal favourites of the homophonic solutions for the Z13 listed by Dave Oranchak are “Sarah The Horse”, “All Banana Alan”, and “Trove Behemoth”. Ever since I read that, I’ve been deeply distrustful of anyone called Sarah The Horse: I therefore advise all Cipher Mysteries readers to keep some sugar cubes in a convenient pocket, just in case this particular worst-case scenario is correct. What, me worry?

The Hunt for the Zodiac Killer

Anyway, I’m not sure how long the following embedded video will stay live for, but it’s courtesy of well-known site Tagtélé:

video since removed from TagTele site

Enjoy! 🙂

When talking about the Zodiac Killer Z340 cipher, FBI cryptanalyst Dan Olson once pointed out that:

Statistical tests indicate a higher level of randomness by row, than by column. This indicates that the cipher is written horizontally and rules out any transposition patterns that are not strictly horizontal.

Here, while I’d agree with his observation part (the first sentence), I’m really not so sure about the conclusion part (the second sentence). And a little further on, Olson continues:

Row randomness of 408 is .22, 340 is .19. Column randomness of 408 is .48, 340 is .68. By way of comparison, row and column randomness should be near identical if the 340 does not contain any message, or if there is a message that is evenly scrambled.

This second time round, I’m comfortable with the observations here (the first two sentences), and mostly comfortable with Olson’s conclusion (the last sentence). However, I’d add that you have to be careful with his conclusion, because there is an implicit (but incorrect) follow-on conclusion lurking just beyond its limits for many readers: that if the cipher is not sequenced along columns, it must surely be primarily sequenced along rows of the text.

On the positive side, I would agree that we can conclude from this that we are not looking at a ‘pure’ periodic transposition cipher (i.e. one that rakes over the whole ciphertext, or even over the top or bottom halves). But what would it mean to assert that the Z340 is a bit more horizontal than vertical, though not as horizontal as the Z408?

A New Axis to Grind?

My (admittedly as-yet-hypothetical) explanation for all of the above is that what lurks behind is perhaps a short transposition cycle (i.e. no more than two or three elements long), where the elements are arranged across two or three consecutive lines, and where the end of each cycle steps back to the letter position immediately after the beginning of the cycle.

According to this, each ciphertext line would contain every second or third letter in the plaintext: for even though this would weaken the horizontal (row) adjacency patterning, it would not eliminate it. And statistically, this is essentially what we see: weakened horizontal patterning but no obvious vertical patterning. Because of the apparent groups of three lines (also noted by Olson), I suspect that these are arranged over three lines: and so this forms my primary hypothesis going forward.

A Quick JavaScript Test

I’ve posted up a quick JavaScript gist of what I’m talking about here: https://gist.github.com/anonymous/c53f88caf1dc6bd18a6bf6af45895b2c

The preliminary results of running this code fragment yield a different internal structure for each of the two halves (various intriguing results in bold):

Top half, first nine lines:
0: off2 = 3, off3 = 3, metric = 8
1: off2 = 2, off3 = 6, metric = 8
2: off2 = 2, off3 = 3, metric = 8
3: off2 = 0, off3 = 3, metric = 7

4: off2 = 3, off3 = 14, metric = 6
5: off2 = 1, off3 = 7, metric = 6
6: off2 = 0, off3 = 7, metric = 6
7: off2 = 3, off3 = 2, metric = 5
8: off2 = 2, off3 = 7, metric = 5
9: off2 = 2, off3 = 5, metric = 5

Bottom half, first nine lines:
0: off2 = 1, off3 = 0, metric = 10
1: off2 = 3, off3 = 11, metric = 9
2: off2 = 3, off3 = 10, metric = 9
3: off2 = 0, off3 = 4, metric = 9
4: off2 = 3, off3 = 15, metric = 8
5: off2 = 0, off3 = 8, metric = 8
6: off2 = 4, off3 = 8, metric = 7
7: off2 = 4, off3 = 4, metric = 7
8: off2 = 2, off3 = 15, metric = 7
9: off2 = 0, off3 = 10, metric = 7

Note that the period-19 (i.e. 17+2) effect is still slightly visible in the top half, but it’s much less apparent in the bottom half.

However, the most striking new pattern here is the (off2 = 1, off3 = 0) pattern in the bottom half, that yields ten pair matches in the untransposed text. This is the kind of zigzag transposition pattern one might expect of what Filippo Sinagra calls “peasant ciphers” – improvised amateur cryptographic tricks, that aim for security through obscurity.

Of course, I still have no idea whether or not I’m merely generating coincidences from the 17 x 17 x 2 = 578 permutations being examined here. But nonetheless it’s all quite interesting, right?

I’ve had the Zodiac Killer Z340 cipher on my mind for the last few days. Though I’m still finding it hard not to draw the conclusion that its top and bottom halves are two different ciphertexts (joined together for reason(s) we can only hazily guess at), what has drawn so much of my attention is a quite different class of statistical observation: letter skips.

Letter Skips

The most (in)famous example of letter skips was the Bible Code, made famous by Michael Drosnin’s (1997) book The Bible Code. However, this was merely one in a long line claiming that the Bible is not only the literal and exact Word of God, but is also an implicit encipherment of all manner of unexpected occult statements and prophecies. To get to these secret messages, all you have to do is read every nth letter, modulo length(Bible): and then, if you hunt through the vast swathes of near-random junk that emerges from that, you’ll eventually discover words, phrases, and proper names that couldn’t possibly have been known millennia ago when the Bible was first written down.
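To make the mechanics concrete, here is a minimal Python sketch of a letter-skip reader. The function name `letter_skip` and its parameters are my own illustrative choices, not anything taken from Drosnin or his critics: it simply strips out non-letters, then reads every nth letter from a chosen start position, wrapping round modulo the length of the text.

```python
def letter_skip(text, start, skip, length):
    """Read `length` letters from `text`, taking every `skip`-th letter
    from index `start` and wrapping round the end (modulo text length).

    Non-letters are discarded first, as letter-skip hunters typically do.
    """
    letters = [c for c in text.upper() if c.isalpha()]
    return "".join(
        letters[(start + i * skip) % len(letters)] for i in range(length)
    )


# Hypothetical usage: try a handful of skips over one short sample text.
sample = "Call me Ishmael. Some years ago, never mind how long precisely"
for skip in range(2, 6):
    print(skip, letter_skip(sample, 0, skip, 8))
```

With a whole book as input, and every start position and skip value up for grabs, the number of candidate strings produced this way is enormous, which is exactly why something word-like eventually turns up.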

There have been plenty of mathematical and statistical dismissals of the Bible Code, almost all of which reduce to the simple argument that if you search enough random letter sequences for long enough, you’ll find something that sort of looks like text. And so when Drosnin huffed that “When my critics find a message about the assassination of a prime minister encrypted in Moby Dick, I’ll believe them”, his critics took it literally as a challenge. As a result, we now have lists of numerous Drosnin-style letter-skip ‘predictions’ in Moby Dick, along with a ‘prediction’ of Princess Diana’s death [thanks to Brendan McKay].

From which the moral unavoidably seems to be: be careful what you wish for.

Generated Coincidences

At the heart of the Bible Code lies a simple sampling fallacy: which is that if you perform a long enough series of arbitrary statistical analyses on the text of any given document, you will (eventually) uncover things in it which superficially appear extraordinarily improbable.

This is directly relevant to a lot of the Zodiac Killer code-breaking discourse because, broadly speaking, it is exactly what has happened there: diligent statistical enquiry has yielded not only millions of strike-out tests, but also a large number of (superficially) unlikely-looking patterns. And so the question is: if you perform a hundred different statistical tests and one of them happens to yield a pattern that only appears in one in two hundred randomised versions of the same document, have you (a) found something fundamental and causal that could possibly explain everything, or (b) just generated a coincidence that means nothing?
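As a back-of-envelope check on those particular numbers, here is a short Python sketch. It assumes (unrealistically) that the hundred tests are statistically independent of one another, which tests run on the same ciphertext almost never are, so treat the answer as a rough guide rather than a proper significance calculation. The function name `chance_of_fluke` is purely illustrative.

```python
def chance_of_fluke(num_tests, p_single):
    """Probability that at least one of `num_tests` independent tests
    throws up a pattern that each has only a `p_single` chance of
    producing on a randomised text."""
    return 1.0 - (1.0 - p_single) ** num_tests


# A hundred tests, each with a 1-in-200 chance of a chance 'hit'.
print(round(chance_of_fluke(100, 1 / 200), 3))  # 0.394
```

In other words, under these assumptions you would expect to turn up at least one such “one in two hundred” pattern nearly 40% of the time, without there being anything causal behind it at all.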

Sadly, there is no obvious way of telling the difference: all one can do is nod sagely and say, in the words of a great 1970s philosopher…

…“COULD BE!”

Transposition or “Tasoiin rnpsto”?

As should be plain as day from the above, I too view Bible Code letter skips as complete nonsense, and reserve my inalienable human right to cast a similarly cool eye over the impressive panoply of Zodiac Killer cipher observations, each of which may or may not be a generated coincidence.

Even so, utter disbelief of the specifics of the Bible Code shouldn’t mask the fact that the kind of statistical tests that are used for letter skips share a significant overlap with the kind of statistical tests that help reveal periodic ciphers and transposition ciphers.

Hence evidence of a letter-skip period in the Zodiac Killer Cipher should not be automatically put to one side because of the test’s association with hallucinatory Bible Code letter-skips, because evidence of a periodic effect could instead be pointing towards one of many other phenomena.

And there is indeed strong evidence of a period in play in the Z340, as first discussed by Daikon and Jarlve in 2015. Daikon examined the number of Z340 bigram repeats at different periods, and found a significant spike at period 19 (this really is noticeably larger than the other periods).

Here’s what these period-19 bigram repeats look like (was this diagram made by David Oranchak?):

Having then performed 1,000,000 random shuffles, David Oranchak concluded that this period-19 result had a “1 in 216” chance of happening. Which is good, but just a smidgeon short of great.
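For anyone wanting to try this kind of test at home, here is a hedged Python sketch of a period-n bigram repeat count plus a shuffle test. The repeat definition used here (every occurrence of a symbol pair beyond its first counts as one repeat) is my assumption of a plausible measure: Daikon, Jarlve and David Oranchak may well have counted things slightly differently, and the function names are hypothetical.

```python
import random
from collections import Counter


def bigram_repeats(symbols, period):
    """Count repeated 'bigrams' formed from symbols `period` apart.

    Each occurrence of a pair beyond its first counts as one repeat.
    """
    pairs = [
        (symbols[i], symbols[i + period])
        for i in range(len(symbols) - period)
    ]
    return sum(n - 1 for n in Counter(pairs).values())


def shuffle_p_value(symbols, period, trials=1000, seed=0):
    """Fraction of random shuffles with at least as many period-`period`
    bigram repeats as the unshuffled sequence."""
    rng = random.Random(seed)
    observed = bigram_repeats(symbols, period)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if bigram_repeats(pool, period) >= observed:
            hits += 1
    return hits / trials


# Toy usage on a made-up sequence (NOT the Z340 transcription).
toy = "ABCDABCEABCFABCG"
print(bigram_repeats(toy, 1), shuffle_p_value(toy, 1, trials=200))
```

To reproduce a figure like “1 in 216” you would need to feed in a Z340 transcription, set the period to 19, and run far more than a few hundred shuffles.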

Incidentally, it’s easier to see these bigram matches if you rewrite Z340 in 19-wide columns (this diagram also probably made by David Oranchak):

More tests revealed all manner of similar periodic results that may or may not mean something: but I’m interested here specifically in the period-19 result.

Period-19? So what?

When he constructed the Z340, the Zodiac Killer had previously seen his Z408 cipher not only printed on the front page of newspapers (which surely pleased him), but also very publicly cracked (which surely displeased him). And yet his Z340 cipher closely resembles the Z408 in so many ways that it seems a fairly safe bet to me that his later cipher system was nothing more than a modification (a ‘delta’) of the earlier cipher system rather than something wildly different.

Hence I’ve long suspected that if we could somehow work out what the Zodiac Killer thought was technically wrong with the Z408 cipher system, then we could make a guess what his delta to the Z340 system might be.

Even though the Z408 presented all manner of homophone cycles, it wasn’t these that gave the game away to Donald Gene Harden and Bettye June Harden of Salinas. Rather, they made a number of shrewd psychological guesses (that the most likely first word a psychopath would write was “I”, and that the plaintext would include the word “KILL” multiple times), and used repetitions of “LL” as cribbed ways in to the message.

(As an aside, I struggle to believe that Bettye Harden genuinely guessed from scratch that the first three words of Z408 would be “I LIKE KILLING”, as has been reported. Instead, it seems far more likely to me that she had already worked for several hours on the cipher before making such an inspired guess.)

And so it seems most likely to me that the Zodiac Killer conceived his delta specifically as a way of disrupting the weakness of doubled letters (specifically doubled L), but without really affecting the rest of his code-making approach. And as always in cryptography, there are numerous ways this could be achieved:
* removing the second letter of all doubled letter pairs
* adding in new tokens for specific doubled letters (e.g. use ‘$’ to encipher ‘LL’)
* disrupting the order of the letters (i.e. transposing them) so that ILIKEKILLING becomes IIEILN LKKLIG etc
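The third option in the list above can be sketched in a couple of lines of Python. This is only an illustration of the simplest such transposition (taking every second letter, then the letters in between), with a hypothetical function name of my own: nothing here is claimed to be the Zodiac’s actual method.

```python
def skip_transpose(plaintext, period=2):
    """Split `plaintext` into `period` strands: strand k holds the
    letters at positions k, k + period, k + 2 * period, and so on."""
    return [plaintext[offset::period] for offset in range(period)]


print(" ".join(skip_transpose("ILIKEKILLING")))   # IIEILN LKKLIG
print(" ".join(skip_transpose("TRANSPOSITION")))  # TASOIIN RNPSTO
```

Note that while the telltale “LL” does get pulled apart, new doubled letters (“II”, “KK”) appear in its place, so a trick like this disguises specific doubles rather than removing doubled symbols altogether.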

I’m therefore wondering if his cipher system delta was some kind of period-19 transposition. But – of course – people have already checked for the presence of straightforward period-19 transposition, and have basically drawn a blank. So if there is a period-19 ‘signature’ arising from some kind of transposition, it’s a little more complicated.

But if so, then what would it look like?

A three-way line dance?

My final piece of observational jigsaw in today’s reasoning chain is that the Z340 ciphertext is apparently arranged in groups of three lines. FBI cryptanalyst Dan Olson famously commented that…

Lines 1-3 and 11-13 contain a distinct higher level of randomness than lines 4-6 and 14-16. This appears to be intentional and indicates that lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake.

…though note that this mixes up observation (the first sentence) with his best-guess inference (the second sentence). What I’m instead taking is that Olson’s observation more generally implies that lines are somehow grouped together in sets of three BUT with a spare line added in between the top and bottom half.

So, the overall line grouping sequence of the Z340 appears to be:
* top half: 1-1-1 2-2-2 3-3-3 X [a spare line with “cut marks” at either end of a fake line]
* bottom half: 4-4-4 5-5-5 6-6-6 X [a spare line with ‘ZODAIK’-like fake signature at the end]

Hence – putting it all together – I’m now wondering whether there is a period-19 transposition in play here BUT arranged in groups of three lines at a time. In which case, the symbol sequence for each set of three lines (3 x 17 = 51) might well look like this (where 01 is the first symbol of the plaintext, 02 is the second symbol, etc):

* 01 04 07 10 13 16 19 22 25 28 31 34 37 40 43 46 49
* 47 50 02 05 08 11 14 17 20 23 26 29 32 35 38 41 44
* 42 45 48 51 03 06 09 12 15 18 21 24 27 30 33 36 39

This transposition arrangement would yield both the period-19 effect and the groups-of-three-lines effect: and might also go some of the way towards explaining why lines 10 and 20 function differently to the other lines.
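The three-line layout listed above can be generated mechanically, which makes it easier to check and to experiment with. In this Python sketch, the function name and the `shift` parameter (how far each successive row is displaced, here 2, giving 17 + 2 = 19) are my own labels for the hypothesis, not established terminology.

```python
WIDTH, ROWS = 17, 3


def three_line_grid(width=WIDTH, rows=ROWS, shift=2):
    """Lay plaintext symbols 1..width*rows out over `rows` lines.

    Symbol n goes on line (n - 1) % rows; each successive line is
    displaced `shift` columns to the right, wrapping round the line end.
    """
    grid = [[0] * width for _ in range(rows)]
    for n in range(1, width * rows + 1):
        row = (n - 1) % rows
        col = ((n - 1) // rows + shift * row) % width
        grid[row][col] = n
    return grid


grid = three_line_grid()
for line in grid:
    print(" ".join(f"{n:02d}" for n in line))

# Flatten the grid to confirm the period: symbols 01 -> 02 -> 03 sit
# 19 ciphertext positions apart, after which the cycle steps back.
flat = [n for line in grid for n in line]
print(flat.index(2) - flat.index(1), flat.index(3) - flat.index(2))  # 19 19
```

Running this prints the same three rows as the list above. Changing `shift` or `rows` gives a quick way of generating neighbouring variants of the same idea to test against the ciphertext.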

As I mentioned at the top of the post, I also strongly suspect that the top half of the Z340 and the bottom half of the Z340 are separate ciphertext systems, and so any solving should be attempted on the two halves individually, however inconvenient that may be. 🙂

I haven’t tested out this new transposition hypothesis yet: but it’s definitely worth a look, wouldn’t you think, hmmm?

The Voynich Manuscript’s zodiac roundel section has long frustrated researchers’ efforts to make sense of it at a high level, never mind determining what any specific zodiac nymph’s label means.

However, I can now see the outline of a new hypothesis that might explain what we’re seeing here…

A Stylistic Impasse?

The fact that each zodiac sign has thirty nymphs, thirty stars and thirty labels (all bar one?) would seem to be a good indication that some kind of per-degree astrology is going on here: and this is a lead I have pursued for many years.

The literature on this, from Pietro d’Abano to Andalo di Negro to (the as yet unseen) Volasfera, is uniformly Italian: so it would seem a relatively safe bet that the source of this section is also from that same Italian document tree.

At the same time, the observation that the drawings in the zodiac roundels are stylistically quite distinct from the rest of the Voynich Manuscript’s drawings has been made many times.

Combine this with the fourteenth century technological dating for the (unusual) Sagittarius crossbow, and you get loosely driven towards a working hypothesis that at least the central figures were copied from a (still unknown) late 14th century or early 15th century woodcut almanach, of the type that was most commonly found in Germany and Switzerland.

However, this leads to an awkward stylistic impasse: how can this zodiac section be both Italian and German at the same time?

Klebs and Martin

Back in 2009, I mentioned Arnold Klebs’ very interesting 1916 article on the history of balneology in the context of discussing Quire 13. However, there was another intriguing quote there that I only got round to chasing up a few days ago:

The yearly pilgrimages to the healing springs in the month of May, the baths of the women on St. John’s Day, which Petrarca describes so picturesquely in one of his letters from Cologne, were ancient survivals, indications of a deeply rooted love for and belief in the purifying powers of the liquid element. These seasonal wanderings to the healing springs were naturally brought into relation with astral conjunctions, a tendency soon exploited by the calendar makers and astrological physicians. Days and hours were set for bathing, blood-letting, cupping, and purging, carefully ascertained by the position of the stars. Martin in his book gives a great variety of such instances which offer interest from many points of view.

The author and book to which Klebs is referring here are Alfred Martin and his immense (1906) “Deutsches Badewesen in vergangenen Tagen”, Jena: Diederichs. (The link is to archive.org.)

It turns out that Klebs sourced a great deal of his article from Martin’s labour of love (with its 159 illustrations and its 700-entry bibliography), which covers public baths, private baths, Jewish baths, bath-related legislation, mineral baths, bath architecture, bath technology, spas, saunas, and so forth, ranging from Roman times all the way up to 1900, and with a dominant focus on German and Swiss archival sources.

The Zodiac Bath Hypothesis

You can by now surely see where I’m heading with this: a zodiac bath hypothesis, where the Voynich’s zodiac section was in some way a copy of a German/Swiss original, which itself brought together the two traditions of per-degree astrology and good/bad times for “bathing, blood-letting, cupping, and purging” (as described by Klebs).

In some ways, this should be no surprise to anyone, given that the first few nymphs are all sitting in barrels, which were essentially what medieval private baths were (well, half-barrels, anyway).

And perhaps, in the context of clysters (enemas), it’s not inviting too much trouble to speculate what legs drawn apart / together might be representing. 🙂

The problem is that – probably because of my only fragmentary German – I can’t find any mention of “Days and hours were set for bathing, blood-letting, cupping, and purging, carefully ascertained by the position of the stars” in Martin’s German text.

I can see plenty of references to blood-letting (“aderlass”) etc, but pinning down the exact part that Klebs robbed out has proved to be beyond me.

Can I therefore please ask a favour of (one or more of) my German readers; which is simply to find the section in Alfred Martin’s book to which Klebs was referring? Thanks! 🙂

It would seem likely that this will then refer to a book in Martin’s capacious bibliography, at which point the game is (hopefully) afoot!

OK, so there’s like another Zodiac film coming out this summer (2017), and it’s like called Awakening The Zodiac. And if that’s not just like totally thrilling enough for you kerrrazy cipher people already, there’s also a trailer on YouTube long enough to eat a couple of mouthfuls of popcorn (maybe three tops):

I know, I know, some haters are gonna say that it’s disrespectful to the memory of the dead, given that the Zodiac claimed to have killed 37 people, and that the film makers are just building cruddy entertainment on top of their families’ suffering. But it’s just Hollllllllllywood, people, or rather about as Hollywood as you can get when you film it on the cheap in Canada. Though if the pitch was much more elaborate than “Storage Hunters meets serial killer”, you can like paint my face orange and call me Veronica.

Seriously, though, I’d be a little surprised if anyone who knows even 1% more than squat about ciphers was involved: if my eyes don’t deceive me, there certainly ain’t no “Oranchak” in the credits. Maybe there’ll turn out to be hidden depths here: but – like the Z340 – if there are, they’re very well hidden indeed.

In a recent post here, I floated the idea that the Zodiac Killer’s Z408 (solved) cipher’s unusual homophone distribution may have arisen not conceptually (i.e. from a hitherto-unknown book on cryptography), but instead empirically (i.e. emerging from the properties of a specific text).

It’s certainly possible that he might have used his own (private) text to model his homophone distribution, in which case we probably have almost no chance of reconstructing it. However, I think it likely that he instead used the first few characters of an already existing public text (such as Moby Dick, the Book of Genesis, the Declaration of Independence, or whatever) to do this.

It’s a reasonable enough suggestion, I think: and moreover one that we can try to test to a reasonable degree.

Z408’s homophones

A homophonic cipher key allocates a number of cipher shapes to individual plaintext letters, usually (but not always) in broad proportion to their frequency. So in a typical homophonic cipher key you would expect to see far more shapes for E (the most common letter in English) than for, say, Z or Q.

Though this is essentially the case for what we see in the Z408 cipher (particularly for the more frequent letters, ETAOINS), the numbers of homophones chosen for the less frequent letters seem somewhat idiosyncratic and arbitrary:

7 shapes – E
4 shapes – T A O I N S
3 shapes – L R
2 shapes – D F H
1 shape  – B C G K M P U V W X Y
Did not appear: J Q Z

People have long searched for a primer or textbook on cryptography where the description of the alphabetic frequency distribution matches this, or even where the alphabetic frequency ordering (e.g. ETAOINSHRDLU etc) matches the order here, but in vain.

Designing a filter

The basic idea for the filter is easy enough:
* read in characters from the start of a passage (we’re only interested in capitalized alphabetic letters, i.e. A-Z)
* if the instance count of that character is higher than the top of the desired range, then the test fails
* if the instance counts for all the characters are within the desired range at the same time, then the test passes
* else keep reading in more characters until the test terminates

As a side note: of all the Z408 homophones, only X appears exactly once in the Z408 ciphertext itself: but while it is conceivable that the Zodiac Killer might have allocated extra homophones for X, it does seem fairly unlikely.

The desired ranges for each of the characters would look like this (though feel free to adapt this if you disagree with the homophone counts listed above):

[7,7] – E
[4,4] – T A O I N S
[3,3] – L R
[2,2] – D F H
[0,1] – B C G K M P U V W Y J Q Z
[0,3] – X (to err on the side of safety)

Note that the letters with one homophone or none have a slightly broader [0,1] range because we have no way of knowing whether or not they would have actually appeared in the original text.

Here are two test texts that should both pass:

EEEEEEETTTTAAAAOOOOIIIINNNNSSSSLLLRRRDDFFHHZZZZZZZZZZZZZZZZZ

BCGKMPUVWYJQZXEEEEEEETTTTAAAAOOOOIIIINNNNSSSSLLLRRRDDFFHHZZZ
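The filter steps above can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: the names `RANGES` and `passes_filter` are made up for illustration, the text is upper-cased before counting, and anything outside A-Z is simply skipped.

```python
# Desired [min, max] instance counts per letter, copied from the ranges listed above.
RANGES = {}
for letters, bounds in [
    ("E", (7, 7)),
    ("TAOINS", (4, 4)),
    ("LR", (3, 3)),
    ("DFH", (2, 2)),
    ("BCGKMPUVWYJQZ", (0, 1)),
    ("X", (0, 3)),
]:
    for letter in letters:
        RANGES[letter] = bounds


def passes_filter(text, ranges=RANGES):
    """Return the number of letters read when every count first sits inside
    its range, or None if the test fails (or the text runs out first)."""
    counts = dict.fromkeys(ranges, 0)
    letters_read = 0
    for ch in text.upper():  # assumption: case is ignored
        if ch not in ranges:
            continue  # skip spaces, punctuation, digits
        counts[ch] += 1
        letters_read += 1
        if counts[ch] > ranges[ch][1]:
            return None  # this letter has overshot the top of its range
        if all(lo <= counts[l] <= hi for l, (lo, hi) in ranges.items()):
            return letters_read  # every letter is in range at the same time
    return None  # ran out of text before the ranges were all satisfied
```

Returning the number of letters read (rather than just True) also tells you how long the matching opening stretch of the passage is. Both test texts above should stop before their trailing run of Zs is reached.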

Which texts to try?

Since any text published before August 1969 could potentially be a match, it would make sense to look at all manner of texts, and possibly even the first few lines of different chapters of books (though I’d be a little surprised if that was the case). All the same, the filter is easy enough to write (and should execute in a matter of microseconds) and to test, so the difficulty here lies mostly in getting hold of enough texts to try, rather than the compute time as such.

Oddly, I don’t really have a solid feel for how often the filter will find a match: my gut instinct is that roughly one in a million English text comparisons will pass, but that’s just a guesstimate based on each letter having its own little bell-curve distribution, all of which have to match at the same time.

So what do you think will match? “Catcher in the Rye” or “Moby Dick”? Place your bets! 😉

Given that the Zodiac Killer’s first big cipher (the Z408) got cracked so quickly, it shouldn’t really be a surprise that he used a slightly different system for his second big cipher (the Z340). What is (arguably) surprising is that whatever change he made to it has not been figured out since then.

But what was he thinking? What did he want from a cipher? And how might his needs have changed between Z408 and Z340?

The Z408

Ciphers are normally made to be as strong as practically possible, given the technological, time, and resource constraints that apply to both sender and receiver: and with the two main driving needs being privacy and secrecy. Note that these aren’t always the same thing: the way I usually describe it is that while sex with your husband is private, sex with your tennis coach is secret. 😉

And so the first thing I find cryptographically interesting about the Zodiac Killer is that he was creating a cipher from a slightly different angle from either of these: and he certainly wasn’t trying to communicate in any normal sense of the word.

Rather, I think that the point of Z408 was to be taunting, and to demonstrate to the police that he was in control, not them.

So imagine the Zodiac’s probable fury, then, when little more than a week after his three Z408 cryptograms appeared in local newspapers (the Vallejo Times-Herald, the San Francisco Examiner and the San Francisco Chronicle), Donald and Bettye Harden were all over the front pages explaining how they had cracked them.

Didn’t they know who was supposed to be in control here?

What was worse, the Hardens hadn’t used cryptological hardware or even high-powered cryptological smarts. They’d just used the Zodiac’s egoism (they guessed the first letter was “I”) and his psychopathic bragging (they guessed he would use the word KILL multiple times) as keys to his cryptographic front door: and then marched straight in.

I think it’s fairly safe to expect that the Zodiac was pretty pissed off by this.

Note that the Hardens carried on trying to crack the Z340 for many years afterwards: according to their daughter, her “mother wrote poetry and was as absorbed in her writing as she became with the Zodiac codes. She worked on the second code on and off for the rest of her life.”

The Z340

Comparing the overall style of the Z340 with that of the Z408, there seem to be plenty of reasons to think that the two are, at heart, not wildly different from each other. And yet (as is widely known) all the big-brained homophonic solvers written since haven’t made any impact on the Z340 at all.

All the same, I think the second interesting thing to note is that the changes to the Z340 system were surely not made to defend against computer-assisted codebreaking (because that hadn’t yet happened), but rather to make the updated system Harden-hardened, so to speak.

What does this mean? Well, we can probably infer that the first letter of the Z340 is almost certainly not I (not that that helps us a great deal) and that the Zodiac Killer must have done something to conceal or remove the KILL weakness.

But, in my opinion, that latter change would surely not have been a theoretically-motivated cryptographic adaptation (he was without much doubt an amateur cryptographer), but rather something pragmatic and empirical, perhaps along the lines of:
* adding a repeat-the-last-letter token
* adding an LL token
* adding an ILL token
* adding nulls inside tell-tale words
* etc

But there’s a problem with all of these. In fact, there are several problems. 🙁

The Problems

The first problem is that I don’t currently believe any of the above changes are disruptive enough to explain what we see in the Z340.

The basic stats of the four main Zxxx ciphers are:
Z408: 408 symbols, from a set of 54 unique symbols. (Note: E has 7 homophones, AST have 6 each, IO have 5 each, N has 4, FLR have 3 each, DHW have 2 each, everything else has 1).
Z340: 340 symbols, from a set of 63. [Hence symbols/textsize is 18.5%, a fair bit higher than the Z408’s 13.2%]
Z32: 32 symbols, from a set of 30.
Z13: 13 symbols, from a set of 8.
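For anyone who wants to check the symbols/textsize arithmetic for themselves, here is a minimal Python sketch that uses only the lengths and symbol counts listed above. The names `CIPHERS` and `symbol_ratio` are just my own labels.

```python
# (ciphertext length, number of distinct symbols), as listed above
CIPHERS = {
    "Z408": (408, 54),
    "Z340": (340, 63),
    "Z32": (32, 30),
    "Z13": (13, 8),
}


def symbol_ratio(name):
    """Distinct symbols divided by ciphertext length."""
    length, distinct = CIPHERS[name]
    return distinct / length


for name in CIPHERS:
    print(f"{name}: {symbol_ratio(name):.1%}")

# How much higher the Z340's ratio is than the Z408's
increase = symbol_ratio("Z340") / symbol_ratio("Z408") - 1
print(f"Z340 vs Z408: {increase:.0%} higher")
```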

It would be very tempting to suspect (as many people have) that the Z340 is ‘therefore’ just the same as Z408 but with 40% more homophones. Yet a problem with this popular hypothesis is that it should be well within range of automated homophone solvers, and to date they haven’t managed to make any impact.

A second problem is that the kind of homophone cycles that so characterized the Z408 seem to be largely absent in the Z340: and yet because the Zodiac Killer would not have had any clue that these were a technical weakness of his system, it seems unlikely to me that he would have adjusted his system to work around a weakness that he didn’t actually know was a weakness.

A third problem is that the Z340 has a fair number of asymmetries that don’t fit the it’s-a-straight-homophonic-cipher model. For example, lines 1-3 and 11-13 have (as Dan Olson pointed out some years ago) almost no character repeats.

There are yet other asymmetries: for example, while 63 different symbols appear in the top ten lines, only 60 appear in the bottom ten lines. And there’s the mysterious ‘-‘ shape at the start and end of line 10: and the odd-looking “ZODAIK” sequence on line 20.

One final asymmetry: the ‘+’ shape seems to function differently in the top and bottom halves – it is often preceded by ‘M’ in the top half, but never preceded by ‘M’ in the bottom half.

How does assuming the Z340 is a pure homophonic cipher explain any of these behaviours, let alone all of them?

Lines 1-3 and 11-13, revisited

I keep coming back to the 1-3 and 11-13 property as mentioned here. I think it’s important to say that Dan Olson’s conclusion (that “lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake”) seems likely to be landing a little bit wide of the mark.

To me, this same property of these lines implies (a) that the homophonic versions for each letter were probably used in pure sequence here, but also (b) the homophone cycles were somehow ‘reset’ after ten lines (i.e. the homophone cycles all started again at the start of line eleven). And perhaps also that any characters repeated in the first three lines are rarer characters, rather than the homophone-friendly ETAOINSHRDLU etc.
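To make (a) and (b) a little more concrete, here is a minimal Python sketch of strict homophone cycling with an optional reset. Everything in it is hypothetical: the toy key, the symbol names, and the `reset_every` parameter (counted in plaintext letters rather than lines) are made up purely for illustration, and I am not claiming this is what the Zodiac Killer actually did.

```python
def encipher(plaintext, key, reset_every=None):
    """Encipher plaintext using each letter's homophones in strict rotation.

    key maps each plaintext letter to a list of cipher symbols.
    If reset_every is given, every letter's cycle restarts from its first
    homophone after that many plaintext letters.
    """
    positions = {letter: 0 for letter in key}
    out = []
    for i, letter in enumerate(plaintext):
        if reset_every and i > 0 and i % reset_every == 0:
            positions = {l: 0 for l in key}  # restart all the cycles
        symbols = key[letter]
        out.append(symbols[positions[letter] % len(symbols)])
        positions[letter] += 1
    return out


# Toy key (hypothetical): three homophones for E, two for T, one for A
toy_key = {"E": ["e1", "e2", "e3"], "T": ["t1", "t2"], "A": ["a1"]}

print(encipher("EETEAE", toy_key))                 # cycles run on uninterrupted
print(encipher("EETEAE", toy_key, reset_every=3))  # cycles restart after 3 letters
```

With pure cycling, a letter’s symbol can only repeat once all of its homophones have been used up, which is why a stretch of text just after a reset would tend to show few repeated symbols.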

It might even be that the Zodiac Killer kept on adding homophones as he constructed the cipher UNTIL he had three lines’ worth of essentially unique homophones: that is to say, that the three line blocks in 1-3 and 11-13 are how his system made the choice of the number of homophones, rather than as a consequence of the number of homophones he chose. Nobody has yet (to my knowledge) satisfactorily explained where he came up with his homophonic allocation for Z408: certainly, searching for this in crypto books hasn’t yielded any likely candidates.

Could it be that the Zodiac Killer worked backwards from his actual Z408 ciphertext to determine the number of homophones, rather than worked forward from the number of homophones to the ciphertext?

Update: I received the following off-line comment from David Oranchak, but thought it better to include it within the post itself…

Nick, there are a few other seemingly rare phenomena that can be observed in Z340. I’m curious what you think of them.

The first is the pivots:

http://zodiackillerciphers.com/wiki/index.php?title=Encyclopedia_of_observations#The_.22Pivots.22

Those kinds of patterns are difficult to arise by chance, so they are suspected to be some sort of feature of the encoding scheme.

Z408 is littered with repeating bigrams but Z340 seems to have fewer than would be expected via normal homophonic encipherment of a plaintext in a normal reading direction. However, the bigrams show up again if you consider a periodic operation on the cipher text:

http://zodiackillerciphers.com/wiki/index.php?title=Encyclopedia_of_observations#Periodic_ngram_bias

The count of 25 repeating bigrams jumps to 37 or 41 or even higher, depending on the periodic operation applied to the cipher text. Here is a tool that illustrates the various operations:

http://zodiackillerciphers.com/period-19-bigrams/

You’ve already identified the seemingly rare phenomenon of rows that lack repeating symbols. There are 9 such rows. In 1,000,000 random shuffles of Z340, none had that many rows. In fact, the best that was found was 8 rows which occurred in only 12 of the shuffles.

Your “M+” asymmetry observation seems to fit in with the general observation that repeating bigrams are phobic of certain regions of the text. The lower left, for instance, seems to hate bigrams: http://zodiackillerciphers.com/images/z340-repeating-bigrams.png

Another really strange observation is the distribution of non-repeating string lengths. For each position of Z340, measure how far you can read forward without encountering a repeating symbol. You end up with a string with unique sequences of length L. Jarlve found that for Z340, there is a peak of 26 occurrences of unique sequences of length 17 (which happens to be the width of Z340). It is really interesting that in random shuffles, this phenomenon is only observed on the order of one in a billion shuffles.

Finally, I would recommend that anyone interested in this topic should check out this thread on morf’s Zodiac forum: http://zodiackillersite.com/viewtopic.php?f=81&t=3196 Especially the more recent posts on the latter pages. “Jarlve” and “smokie” in particular are doing fantastic work exploring various transcription schemes that could explain the various curious features of Z340 (in particular, the relationships between periodic bigrams and transposition schemes).
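Two of the measurements David describes (rows that lack repeating symbols, and the lengths of non-repeating strings) are simple enough to sketch in Python, along with a basic shuffle test. This is only my own minimal sketch: the function names are made up, it assumes the ciphertext has been transcribed as a sequence with one item per symbol, and it is not a reconstruction of the tools David or Jarlve actually used.

```python
import random
from collections import Counter


def rows_without_repeats(symbols, width=17):
    """Count the rows (of the given width) in which no symbol occurs twice."""
    rows = [symbols[i:i + width] for i in range(0, len(symbols), width)]
    return sum(1 for row in rows if len(set(row)) == len(row))


def unique_run_length(symbols, start):
    """How far you can read forward from start before a symbol repeats."""
    seen = set()
    for sym in symbols[start:]:
        if sym in seen:
            break
        seen.add(sym)
    return len(seen)


def run_length_histogram(symbols):
    """Map each non-repeating run length to the number of positions producing it."""
    return Counter(unique_run_length(symbols, i) for i in range(len(symbols)))


def shuffle_test(symbols, trials=1000, width=17, seed=0):
    """Return (observed, hits): the repeat-free row count of the original, and
    how many random shuffles had at least that many repeat-free rows."""
    rng = random.Random(seed)
    observed = rows_without_repeats(symbols, width)
    pool = list(symbols)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pool)
        if rows_without_repeats(pool, width) >= observed:
            hits += 1
    return observed, hits
```

The default width of 17 matches the Z340’s row length, so a 340-symbol transcription splits cleanly into 20 rows. With any other length, the final short row gets counted too.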