I posted up seven homophonic challenge ciphers a few days ago, and now – though it may sound a little counter-intuitive – I’d like to try to help you solve them (bear in mind I don’t know if they can be solved, but the whole point of the challenge is to find out).
Of the seven ciphers, #1 is the longest (and hence probably the easiest). Reformatted for ten columns rather than five (it uses five cycling alphabets ABCDE, ie. “ABCDE ABCDE” over ten columns):
Commenter Jarlve (whose interesting work on the Zodiac Killer ciphers some here may already know) noted that there is a repeated quadgram here, i.e. the sequence 408 500 113 203 appears twice.
This is entirely true, and also a very sensible starting point: I’ve highlighted this quadgram in the following diagram, along with all other repeated A-alphabet tokens (i.e. 100..125), and also any tokens they touch more than once (i.e. in the B and E alphabets):
Another thing that’s interesting here is that the 102 token (that appears four times and is coloured purple in the above) appears with four different letters before it as well as four different letters after it. In classical cryptology, that’s normally taken as a strong indicator that this is a vowel: and with the high instance count (4 out of 31, i.e. 12.9%), you might reasonably predict that this is E, A, O, or perhaps I (in order of decreasing likelihood).
[Note that I haven’t looked to check what letter this actually is: having created the challenge ciphers, I’ve just left them to one side, and don’t intend to look again at them.]
Similarly, the 114 token (that appears three times and is coloured green) is always preceded by 509, and is followed by 209 on two of the three instances. (Note that the token two after it is 309 in two of the three instances as well.) Again, in classical cryptology, these kind of structured contacts are normally taken as strong indicators that this token enciphers a consonant: and with the high instance count (3 out of 31, i.e. 9.7%), you might reasonably predict that this enciphers T or possibly N, S, or H.
With these two examples in mind, it strikes me that for any given plaintext language (English in the case of these challenge ciphers) you could easily build up probability tables for repetitions of the two tokens before and the two tokens after any given token: and then use those as a basis to predict (for a given ciphertext length) which plaintext letter they imply the letter is likely to be.
Though this may not sound like very much, because you can do this for all five of the alphabets independently, the results kind of rake across the ciphertext, yielding a grid of probabilistic clues that some clever person might well use as a basis for working towards the plaintext in ways that wouldn’t possible with randomly-chosen homophonic ciphers. Just sayin’. 😉
And The Point Is…
It’s entirely true that for homophonic ciphers where each individual cipher is chosen at random, the difficulty of solving a reasonably short cipher with five homophones per letter would be very high. But knowing (as here) that each column is strictly limited to a given sub-alphabet, my point is that many of the tips and tricks of classical cryptology are also available to us, albeit in slightly different forms from normal.
Yet while it’s encouraging for solvers that there is a repeated quadgram here, I don’t currently believe that cipher #1 will be (quite) solvable with pencil and paper, as if it were a Sudoku extra-extra-hard puzzle (though as always, I’d be more than delighted to be proved wrong).
However, my hunch remains that strictly cycling homophonic ciphers may well prove to be surprisingly solvable using deviousness and computer assistance, and I look forward very much to seeing how they fare. 🙂