As I mentioned here and indeed here a few days ago, my usually-Early-Renaissance-focused thoughts have of late been turning slowly to the Zodiac Killer Ciphers, in particular to the unsolved 340-character cipher known as “Z340”. Unusually as cipher mysteries go, we also have an earlier cipher called “Z408” (no prizes for guessing its length) by the same person, one that was quickly cracked (using the crib “KILL”). Z408 turned out to be a homophonic simple substitution cipher (but with spelling mistakes, copying mistakes, and a few subtly odd features); and there are plenty of good reasons to think that Z340 will share many of these same basic aspects (but made somewhat harder to crack).
Even though it was a crib that originally helped to crack it, Z408 has other weaknesses, most notably the way it cycles sequentially through homophones (“multiple ciphertext shapes for the same plaintext character”). For example, plaintext ‘t’ maps to the four ciphertext homophones H, I, 5 and L, which appear in the text in the order HI5LHI5ILHI5LHI5LHI5LHI5LI5LHL5IIHI. If you treat each letter-to-letter transition that follows the modulo-4 cycle [HI5L] as a success with probability 0.25 (there are 26 of these) and each transition that does not as a failure with probability 0.75 (there are 8), I believe the raw probability of getting at least 26 successes from 34 events comes out at less than 1 in a billion. Please check my maths, though – I used this online binomial calculator with N = 35-1, k = 26, p = 0.25, q = 0.75. For more on these homophone sequences, Zodiac ciphermeister Dave Oranchak kindly pointed me at a full list of Z408 homophone sequences.
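For anyone who would rather check the sums in code than with an online calculator, here is a minimal Python sketch of the same calculation. The function names (cycle_matches, binom_tail) are my own inventions for illustration, and the model is the naive one described above: every transition is treated as an independent event with a 0.25 chance of following the cycle, which is an assumption rather than an established fact about how the cipher was built.

```python
from math import comb

def cycle_matches(seq, cycle):
    """Count transitions in seq that step to the next symbol of cycle (wrapping round)."""
    successor = {s: cycle[(i + 1) % len(cycle)] for i, s in enumerate(cycle)}
    return sum(1 for a, b in zip(seq, seq[1:]) if successor[a] == b)

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# The order in which the homophones for plaintext 't' appear in Z408 (from the text above).
t_sequence = "HI5LHI5ILHI5LHI5LHI5LHI5LI5LHL5IIHI"

n = len(t_sequence) - 1                  # number of transitions (N = 35-1)
k = cycle_matches(t_sequence, "HI5L")    # transitions that follow the H->I->5->L->H cycle
p_tail = binom_tail(n, k, 0.25)          # chance of at least k matches under the naive model

print(f"N = {n}, k = {k}, P(X >= k) = {p_tail:.3e}")
```

Run as written, this recounts the 26 matches out of 34 transitions directly from the sequence and gives a tail probability below 1e-9, which agrees with the "less than 1 in a billion" figure.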
Incidentally, the top few match counts are:-
e -> ZpW+6NE – N = 54-1, k = 38
t -> HI5L – N = 35-1, k = 26
s -> F@K7 – N = 20-1, k = 15
o -> X!Td – N = 27-1, k = 13
n -> O^D( – N = 23-1, k = 20
i -> 9PUk – N = 44-1, k = 35
a -> GSl8 – N = 26-1, k = 10
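As a rough guide only, the same naive binomial tail can be computed for every row in the list above. The sketch below assumes a per-transition success probability of 1 divided by the number of homophones (so 1/7 for 'e' and 1/4 for the rest), and it also shows one crude way of allowing for having picked the best cyclic order: multiplying by the number of distinct cyclic orders, (m-1)!, and capping at 1. Both choices are my own assumptions for illustration, not a proper significance test.

```python
from math import comb, factorial

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# (plaintext letter, homophones in cycle order, occurrences, matching transitions)
rows = [
    ("e", "ZpW+6NE", 54, 38),
    ("t", "HI5L",    35, 26),
    ("s", "F@K7",    20, 15),
    ("o", "X!Td",    27, 13),
    ("n", "O^D(",    23, 20),
    ("i", "9PUk",    44, 35),
    ("a", "GSl8",    26, 10),
]

results = {}
for letter, homophones, occurrences, k in rows:
    m = len(homophones)              # number of homophones for this letter
    n = occurrences - 1              # number of transitions (N = occurrences - 1)
    raw = binom_tail(n, k, 1.0 / m)  # naive tail probability
    orders = factorial(m - 1)        # distinct cyclic orders of m symbols
    corrected = min(1.0, raw * orders)  # crude Bonferroni-style allowance (an assumption)
    results[letter] = (raw, corrected)
    print(f"{letter}: N={n}, k={k}, raw={raw:.3e}, x{orders} orders -> {corrected:.3e}")
```

Even this leaves out other sources of preselection (such as which symbols get grouped together in the first place), so the corrected figures are best read as an order-of-magnitude sanity check.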
It would be great to tell you how statistically significant these sequences are, but I know enough stats to know that it’s not quite as easy as it looks (for a start, we’re preselecting the best order of homophones to use, which inflates the apparent significance) – any passing statisticians, please feel free to leave a comment. I’m also quite surprised that apparently nobody has tried to use this weakness as a direct way to find the Z340 cipher’s homophones (though John Graham-Cumming did blog about it in June this year), but – as I’ll show shortly – I suspect trying just that on its own wouldn’t be enough.
Taking a brief step sideways, I’m always intrigued by mistakes in ciphers, because these often point to how the cipher was constructed. One interesting feature (though one I’m still trying to understand to my own satisfaction) is the solid triangle cipher shape in Z408, and how it appears to encipher different letters at different times. The view often put forward elsewhere is that this variation is down to copying errors, perhaps arising because the Zodiac Killer’s pen was too thick, causing him to misread his draft version. As for me, I’m not so sure, because the solid triangle decrypts to a curious sequence:-
* “A” in “bec-A-use”
* “S” in “mo-S-t dangerous”
* “A” in “an-A-mal”
* “S” in “mo-S-t thrilling”
* “A” in “with -A- girl”
* “S” in “if it i-S-”
* “E” in “my slav-E-s”
* “A” in “my -A-fterlife”
Of these, only the “A” in “an-A-mal” is possibly a copying error (“I” is enciphered by an empty triangle shape) rather than just a spelling mistake (the Zodiac Killer has plenty of those). But even that seems a little unlikely once you notice the overall ASASAS[E]A pattern that emerges, which is so very similar to the homophonic sequences discussed above. I haven’t yet figured out what this implies, but it’s pretty interesting, right?
Moving on to the uncracked Z340 cipher, I have to say that what strikes me most is the difference between its top half (lines 1-10) and its bottom half (lines 11-20). It turns out that back in 2009, FBI codebreaker Dan Olson pointed out to Tom at zodiackiller.com that lines 1-3 and 11-13 contained very few repeats: other people have wondered whether this points to some kind of block-level transposition going on. Me, I think there’s a far stronger inference to be made: even though they share nearly all the same character shapes, I’m pretty sure that the top and bottom halves of Z340 use completely different cipher letter assignments, and hence may well need to be cracked independently. Further, I suspect that the Zodiac may well have intended to send them out separately (Z408 was sent as three independent sections), but (for some reason) ended up sending them both as a single cipher.
[Incidentally, I also don’t believe that the last few letters of the bottom half of Z340 are genuinely part of the ciphertext to be cracked: they seem to spell “ZODAIK”, which is just a touch too coincidental for me. 🙂 ]
Right now, I think that a constructive first big step would be to search for statistically significant homophone sequences in the top and bottom halves of Z340 separately, because we can be reasonably sure that the most frequent letters will have four or more homophones, just as with the Z408 cipher: trying this out may well yield some surprisingly revealing results. Any takers at the FBI? 😉
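To make that suggestion a little more concrete, here is a minimal Python sketch of the simplest version of such a search: it looks only at pairs of symbols and scores how strictly each pair alternates. Everything in it is an assumption for illustration: the function name pair_cycle_scores, the fixed 0.5 chance of alternation under the null (which ignores the two symbols' relative frequencies), and above all the toy string, which is made up and is not Z340 ciphertext. Cycles of four or more homophones would need something more, for instance merging strongly alternating pairs.

```python
from itertools import combinations
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pair_cycle_scores(ciphertext, min_count=3):
    """Score every pair of sufficiently frequent symbols by how strictly they alternate.

    Returns a list of (tail_probability, a, b, alternations, transitions),
    sorted so that the most surprising pairs come first.
    """
    symbols = sorted(s for s in set(ciphertext) if ciphertext.count(s) >= min_count)
    scores = []
    for a, b in combinations(symbols, 2):
        # Keep only the occurrences of these two symbols, in reading order.
        sub = [c for c in ciphertext if c in (a, b)]
        n = len(sub) - 1
        # An alternation is any transition where the symbol changes.
        k = sum(1 for x, y in zip(sub, sub[1:]) if x != y)
        scores.append((binom_tail(n, k, 0.5), a, b, k, n))
    return sorted(scores)

# Made-up toy text in which 'A' and 'B' alternate perfectly (NOT real ciphertext).
toy = "AxByAzBxAyBzAxB"
best = pair_cycle_scores(toy)[0]
print(best)
```

On the real cipher, the idea would be to run this once on the symbols of lines 1-10 and once on lines 11-20, then compare which pairs score well in each half.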