The 408-symbol Zodiac Killer cipher (‘Z408’) was cracked by Donald and Bettye Harden in 1969, and the 340-symbol Zodiac Killer cipher (‘Z340’) arrived not long after. Ever since then, there has been a widespread presumption among researchers that the later cipher would just be a more complicated version of the earlier one (e.g. perhaps transposed in some way).
The Z340 certainly resembles Z408, insofar as the cipher shapes employed in both are very similar, and that lends support to the widely-held presumption that Z340 uses the same kind of ‘pure’ homophonic cipher system. But is that the whole story? Personally, I’m not so sure…
Unusual Aspects of the Z340
It has long been pointed out that the Z340 cipher sports a number of idiosyncratic features that are not present in the earlier Z408 cipher. For example, the FBI’s Dan Olson pointed out a few years ago that:
* Statistical tests indicate a higher level of randomness by row, than by column. This indicates that the cipher is written horizontally and rules out any transposition patterns that are not strictly horizontal.
* Lines 1-3 and 11-13 contain a distinct higher level of randomness than lines 4-6 and 14-16. This appears to be intentional and indicates that lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake.
* Because of the vertical symmetry of the statistical observations, the message may have been written, then split into two equal size parts and placed top over bottom.
These observations suggest that something odd might be going on inside the cipher. In this respect, the Z340 cipher resembles the Voynich Manuscript’s frustrating ‘Voynichese’, which looks straightforward on the surface but turns out to have many behavioural features that are not seen in other known ciphers.
I’d also add that row #10 starts and ends with ‘-’, which looks somewhat artificial – though it could just be random, it may also have some kind of meta-significance for the interpretation of the overall cryptogram (e.g. “CUT HERE”).
Finally, I’d add that Z340’s final (20th) line looks very much as if it contains a mangled ZODIAK signature, which – if correct – would probably make sense as 50% crypto padding, and 50% flipping the bird at the FBI. 😉
Anyway, given that the message contains 20 lines of 17 symbols (20 x 17 = 340) and we can see similar artefacts in rows 1-3 and 11-13, then it seems likely to me that there was some kind of major coding break after row #10.
Consequently, I’ve long wondered whether the two halves of Z340 (let’s call them ‘Z170-A’ and ‘Z170-B’) used a different set of cipher-symbol-to-plaintext-letter assignments to each other: in which case, the sensible way to make progress would be to try to solve each half separately. Even so, we would still need to eke out some additional assistance (or meta-assistance) from the texts to make progress, because the odds are so heavily stacked against us.
Yet there’s another feature of the Z340 cipher which struck me a while back but which I haven’t got round to blogging about until now. It’s all to do with doubled shapes, and the story starts with the Z408 ciphertext…
Z408’s doubled letters
To construct Z408, the Zodiac Killer used 7 shapes for { E }, 4 shapes each for { T, A, O, I, N, S }, 3 shapes each for { R, L }, 2 shapes each for { D, F, H }, and 1 shape each for the rest (probably): this yields a grand total of 54-ish cipher shapes to encipher 26 plaintext letters.
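As a quick sanity check on that tally, here is a minimal Python sketch that just adds up the tiers listed above. The one-shape-per-remaining-letter figure is an assumption (it is the "probably" part), so the result is an upper bound rather than an exact count.

```python
# Homophone tiers as described above: shapes per letter -> letters in that tier.
tiers = {
    7: "E",
    4: "TAOINS",
    3: "RL",
    2: "DFH",
}

tiered_letters = sum(len(letters) for letters in tiers.values())
tiered_shapes = sum(n * len(letters) for n, letters in tiers.items())
remaining_letters = 26 - tiered_letters

# Assumption: every remaining letter gets exactly one shape (an upper bound).
max_total_shapes = tiered_shapes + remaining_letters

print(tiered_letters, tiered_shapes, remaining_letters, max_total_shapes)
```

The listed tiers account for 43 shapes across 12 letters, so one shape for each of the other 14 letters would give 57. A total nearer 54 would be consistent with a few letters going unused, or with one or two tier sizes being slightly off.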
Given that the letter-frequency ordering for English is often described as “ETAOINSHRDLU…”, this tiered arrangement makes sense (as I recall, various researchers have tried to use the homophone allocation to infer which popular cryptography manual the Zodiac Killer specifically relied upon, but I don’t remember if there was a definitive answer to that question).
However, one particular letter caused him a lot of practical problems for Z408: the letter L. Even though this has a relatively small frequency count (compared to, say, the letter E), the particular text he enciphered included numerous ‘LL’ pairs. That is kind of what you get if you want to say the word ‘KILL’ all the time: the words with a double-L are KILLING, KILLING, ALL, KILL, THRILLING, WILL, ALL, KILLED, WILL, WILL, WILL, and COLLECTING.
(As an aside, I’ve often wondered whether the multiple repetitions of the word “WILL” might possibly imply that the Zodiac Killer’s first name was indeed “WILL” / William. The subconscious is a funny ‘wild animal’ in that way.)
Anyway, as a direct result of this, the letter L is used here more often than its normal English stats would suggest, and so the Zodiac Killer had to encipher ‘LL’ 12 times with only three shapes in its tier. To avoid pattern repetitions, he ended up doubling up the enciphered L-shapes a few times, and so the final Z408 ciphertext included a number of doubled L shapes.
The only other doubled letter was ‘G’, which only had a single shape allocated to it, and which appeared doubled only once.
Z340’s Problem With ‘+’
If Z340 (which uses 63 distinct shapes) uses a similar kind of homophonic cipher to the one used in the Z408 cipher (which uses 54 distinct shapes), then I would say it has a very specific problem with whatever is being enciphered by the shape ‘+’.
‘+’ occurs 24 times (7% of the total number of characters, and exactly double that of ‘B’, the second most frequent shape), which by itself largely makes a nonsense of the suggestion that Z340 is a homophonic cipher: anything with that high a frequency count should surely have a whole set of homophones to represent it.
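For anyone wanting to reproduce this kind of count against their own transcription, a minimal Python sketch follows. The `toy` string is a made-up stand-in, not the real Z340 text, and `symbol_shares` is just an illustrative helper name.

```python
from collections import Counter

def symbol_shares(ciphertext):
    """Return (symbol, count, share) tuples, most frequent first."""
    counts = Counter(ciphertext)
    total = len(ciphertext)
    return [(sym, n, n / total) for sym, n in counts.most_common()]

# Toy stand-in, NOT the real Z340 transcription.
toy = "+B+pM+lB+*+M"
for sym, n, share in symbol_shares(toy):
    print(f"{sym!r}: {n} ({share:.0%})")

# The figure quoted above: 24 occurrences out of 340 symbols.
print(f"{24 / 340:.1%}")  # 7.1%
```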
You might wonder whether ‘+’ enciphers a frequent word or syllable, such as ‘THE’ or ‘ING’. However, it appears three times immediately doubled with itself, i.e. ‘++’ (the only other shape that appears doubled is the sequence ‘pp’ that occurs once near the start of row #4).
Even if, as Dave Oranchak did, you do a brute force search for homophone cycles (don’t get me started on what they are, or we’ll be here all night), you don’t find anything that accounts for Z340’s ‘+’ shape.
And yet, as Dave Oranchak points out, Z340 has some strong-looking homophone cycles, such as [l*M] [l*M] [l*M] lM [l*M] [l*M] [l*M], which would seem to imply that Z340 is at heart a homophonic cipher. There are plenty of other measures (many noted by my late friend Glen Claston) that point in the same direction.
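To give a rough feel for what a cycle test looks like, here is a small Python sketch. It is a simplified scoring function of my own devising, not a description of Dave Oranchak's actual search: it strips the ciphertext down to the symbols in a candidate cycle and counts how often each one is followed by the next in rotation. The `toy` string is hypothetical.

```python
def cycle_score(ciphertext, cycle):
    """Count how often the symbols in `cycle` appear in strict rotating order.

    Returns (hits, transitions): of all consecutive pairs in the subsequence
    of `ciphertext` restricted to the cycle's symbols, how many step to the
    next symbol of the cycle.
    """
    position = {sym: i for i, sym in enumerate(cycle)}
    seq = [position[ch] for ch in ciphertext if ch in position]
    hits = sum(1 for a, b in zip(seq, seq[1:]) if b == (a + 1) % len(cycle))
    return hits, max(len(seq) - 1, 0)

# Hypothetical toy text: the cycle symbols read l * M l * M l M (one '*' skipped).
toy = "lAB*CMDlE*FMGlHMI"
print(cycle_score(toy, "l*M"))  # (6, 7)
```

A brute-force search would run something like this over every small combination of shapes and keep the highest-scoring candidates.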
Moreover, because the number of shapes used is greater than for the Z408 cipher, you would naturally expect to see more tiers or wider tiers (though 7 shapes for E was already quite a wide tier). So you would naturally expect to see a consequent flattening of the statistics. And yet ‘+’ bucks that trend completely.
How Can We Reconcile These Two?
As a starting point, you might note that ‘M+’ occurs three times in the top half, but not at all in the bottom half. In fact, M is always followed by + in the top half, and never followed by + in the bottom half (where it occurs four times).
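This kind of half-by-half check is easy to script. The Python sketch below assumes the ciphertext is read row by row as one long string (so for Z340 the midpoint falls after row #10), and uses a hypothetical toy string built to mimic the pattern described, not the real transcription.

```python
from collections import Counter

def followers(text, symbol):
    """Count which symbols come immediately after `symbol` in `text`."""
    return Counter(b for a, b in zip(text, text[1:]) if a == symbol)

def split_halves(ciphertext):
    """Split a ciphertext into top and bottom halves by symbol count."""
    mid = len(ciphertext) // 2
    return ciphertext[:mid], ciphertext[mid:]

# Hypothetical toy text: 'M+' three times in the top half, never in the bottom.
toy = "aM+bcM+dM+e" + "fMghMiMjkMl"
top, bottom = split_halves(toy)
print("top:   ", dict(followers(top, "M")))
print("bottom:", dict(followers(bottom, "M")))
```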
It seems to me that the ‘+’ shape makes the top half (Z170-A) easy and the bottom half (Z170-B) difficult all at the same time. And that’s not something that I personally can comfortably reconcile with the kind of one-size-fits-all pure homophonic solutions most people seem to be looking for, even with confounding transposition stages thrown in: the behaviour of the ‘M+’ pattern would seem to point away from almost all of the transposition variants previously proposed.
Having really, really thought about it, my tentative conclusion is that ‘+’ seems to operate more as a kind of meta-token rather than as a pure token. I mean this in the same general way that certain Voynichese letters seem to me to encipher 15th century shorthand tokens (‘contractio’, etc).
A Suggestion
As I recall, the Hardens found their crib by guessing that the first letter was “I” and then looking for the word “KILL”: the ease of which doubtless made an already angry psychopath even angrier.
Hence to my mind, the thing he would most likely have been looking to solve when moving from his Z408 cipher system to his Z340 cipher system was how to make that new system impervious to that specific kind of an attack. And the key letter that let him down first time round was the letter ‘L’, specifically in its doubled form.
Consequently, I propose that this was the single technical challenge that spurred the internal changes from Z408 to Z340. And there was one obvious – but admittedly very old-fashioned – trick that he could have used to make doubled letters harder to see.
So here’s my suggestion. Could it be that the Zodiac Killer used ‘+’ as a meta-token to mean “REPEAT THE LAST LETTER”? (‘++’ would then mean a tripled letter, or perhaps something else entirely).
If that’s correct, I would further expect that ‘M’ was one of the homophones for ‘L’, and the [l*M] cycle could very well have been the 3-long homophone loop for ‘L’.
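To make the suggested mechanism concrete, here is a minimal Python sketch of a "repeat the last letter" token working at the plaintext level. The function names are mine, and it assumes the repeat step was applied before the homophonic substitution, which is a guess on top of a guess.

```python
def hide_repeats(plaintext, repeat="+"):
    """Replace each letter that repeats the previous one with the repeat token."""
    out = []
    last = None
    for ch in plaintext:
        out.append(repeat if ch == last else ch)
        last = ch
    return "".join(out)

def expand_repeats(text, repeat="+"):
    """Undo hide_repeats: each repeat token copies the previous output letter.

    A repeat token with nothing before it is left unchanged.
    """
    out = []
    for ch in text:
        if ch == repeat and out:
            out.append(out[-1])
        else:
            out.append(ch)
    return "".join(out)

print(hide_repeats("I WILL KILL"))    # I WIL+ KIL+
print(expand_repeats("I WIL+ KIL+"))  # I WILL KILL
```

Under this reading, every ‘LL’ in the plaintext turns into a single L-homophone followed by ‘+’, which is the sort of thing that would leave a pattern like ‘M+’ behind.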
Do you really believe that the Zodiac could write a taunting message to the police without using the four-letter sequence “KILL”? In many ways, that was kind of the whole point.