I posted up seven homophonic challenge ciphers a few days ago, and now – *though it may sound a little counter-intuitive* – I’d like to try to help you solve them (bear in mind I don’t know if they can be solved, but the whole point of the challenge is to find out).

Of the seven ciphers, #1 is the longest (and hence probably the easiest). Reformatted for ten columns rather than five (it uses five cycling alphabets ABCDE, ie. “ABCDE ABCDE” over ten columns):

`121,213,310,406,516, 108,200,323,416,513,`

112,208,308,409,515, 102,216,309,425,509,

114,215,309,417,507, 102,201,323,401,517,

111,200,306,408,500, 113,203,313,407,512,

103,223,313,403,511, 119,213,316,416,511,

102,204,324,418,517, 120,203,324,407,516,

105,209,312,401,504, 117,208,310,408,500,

113,203,301,425,513, 115,201,313,408,515,

115,214,308,406,501, 122,204,322,408,509,

114,209,305,412,504, 117,213,316,402,509,

100,200,310,423,513, 100,214,320,419,509,

114,209,309,419,520, 101,200,320,416,518,

120,211,313,403,509, 103,207,313,421,513,

107,209,305,407,523, 115,224,313,416,508,

102,203,306,416,514, 107,200,310,401,509,

103,212,324,

## Repeated Quadgram

Commenter Jarlve (whose interesting work on the Zodiac Killer ciphers some here may already know) noted that there is a repeated quadgram here, i.e. the sequence 408 500 113 203 appears twice.

This is entirely true, and also a very sensible starting point: I’ve highlighted this quadgram in the following diagram, along with all other repeated A-alphabet tokens (i.e. 100..125), and also any tokens they touch more than once (i.e. in the B and E alphabets):

Another thing that’s interesting here is that the 102 token (that appears four times and is coloured purple in the above) appears with four different letters before it as well as four different letters after it. In classical cryptology, that’s normally taken as a strong indicator that this is a vowel: and with the high instance count (4 out of 31, i.e. 12.9%), you might reasonably predict that this is E, A, O, or perhaps I (in order of decreasing likelihood).

[Note that I haven’t looked to check what letter this actually is: having created the challenge ciphers, I’ve just left them to one side, and don’t intend to look again at them.]

Similarly, the 114 token (that appears three times and is coloured green) is always preceded by 509, and is followed by 209 on two of the three instances. (Note that the token two after it is 309 in two of the three instances as well.) Again, in classical cryptology, these kind of structured contacts are normally taken as strong indicators that this token enciphers a consonant: and with the high instance count (3 out of 31, i.e. 9.7%), you might reasonably predict that this enciphers T or possibly N, S, or H.

With these two examples in mind, it strikes me that for any given plaintext language (English in the case of these challenge ciphers) you could easily build up probability tables for repetitions of the two tokens before and the two tokens after any given token: and then use those as a basis to predict (for a given ciphertext length) which plaintext letter they imply the letter is likely to be.

Though this may not sound like very much, because you can do this **for all five of the alphabets independently**, the results kind of rake across the ciphertext, yielding a grid of probabilistic clues that some clever person might well use as a basis for working towards the plaintext in ways that wouldn’t possible with randomly-chosen homophonic ciphers. Just sayin’. 😉

## And The Point Is…

It’s entirely true that for homophonic ciphers where each individual cipher is chosen at random, the difficulty of solving a reasonably short cipher with five homophones per letter would be very high. But knowing (as here) that each column is strictly limited to a given sub-alphabet, my point is that many of the tips and tricks of classical cryptology are also available to us, albeit in slightly different forms from normal.

Yet while it’s encouraging for solvers that there is a repeated quadgram here, I don’t currently believe that cipher #1 will be (quite) solvable with pencil and paper, as if it were a Sudoku extra-extra-hard puzzle (though as always, I’d be more than delighted to be proved wrong).

However, my hunch remains that strictly cycling homophonic ciphers may well prove to be surprisingly solvable using deviousness and computer assistance, and I look forward very much to seeing how they fare. 🙂

Reading Comprehension fail – I must have misunderstood your previous post – isn’t this sort of a Quagmire cipher (ie a Vignere with no ‘key’ per se’)?

So only the length is killing us? Surely a frequency analysis (firstly of groups of letters to guess a key length (yes I know we know it’s 5)) and then one on that actual (theorized) key size would put us in a neat place to do a frequency analysis….

Milongal: it’s not a Vigenere because each of the five alphabets are mapped differently. What’s interesting is that nobody seems to have tried to analyze this cipher system, because picking homophones at random is so much stronger. But… here we are with the Scorpion Ciphers. 🙂

Nick: well, it is. Its a polyalphabetic cipher with five cycling monoalphabetic substitution ciphers. You’ve just assigned visually different alphabets to each.

The alphabets are scrambled, which makes it harder, but letter frequency statistics still apply within each column.

SirHubert: I’ve only presented the ciphers in that way so that there is absolutely no question about what is going on inside. There is no doubt that having five strictly cycling alphabets is significantly harder than a single alphabet, but it’s also far easier than situations where each homophone instance is chosen randomly from a set of five. Hence the reason for the challenge is to help find out exactly where in the middle it lies, because I don’t know of any cryptanalysis of this specific kind of cipher.

NP: It’s not Vignere in so far as each alphabet isn’t a shift, but isn’t it still related – or at least based on the same principle (I had an idea there was a group of ciphers called “Quagmire” that were based on this sort multiple-alphabet encryption – and I thought there had been considerable analysis on them, and there were tools to try to analyse and crack them).

There’s a (free) Security Engineering course offered by UNSW on openlearning (google: “UNSW open Learning Security” shouild be the first result) that had a similar challenge (albeit a lot longer). (If you sign up free, go to “Lectures and Activities”, “Module 3”, “Cipher Challenge 3”.

It took me (and by the look of it most others) a LONG time to solve, but given the number of people who have solved it, it was by no means impossible (without knowing the number of keys etc – in fact there were no clues (formally, actually there was a big clue to the content, but I’m not sure it was noticed by most people until afterward), just a ciphertext). Granted, the text is SIGNIFICANTLY longer, but I think most people got it with a combination of analysis (looking for repeated Di/Tri/+ graphs and trying to work out a likely key space, then frequency analysis within each interval and finally guessing words (some with automated dictionary brute forcing, I think)….

And I think many (most even?) of the people who managed to solve it would have had a limited crypto background – if any.

I like 509 as T.

In fact I’m going to go out on a limb and say:

509 = ‘T’, 114 = ‘H’, 209 = ‘E’, 309=’S’, 215 = ‘I’, 419 is ‘E’

313 occurs a lot and might be another E….but I’ll reserve judgement for the sec

115 occurs 3 times each time followed by a different char, but twice has 313 2 char later

Guesswork and gut feel mainly. Might have a deeper look at it when I have time…..it interests me, but I’m time poor.

Helen Fouche Gaines – Elementary Cryptanalysis (c)1939, Chapter XVIII Periodic Ciphers with Mixed Alphabets. Still difficult, but some good tricks 🙂

Marie: fantastic, thanks! Could I possibly ask you for a scan of the chapter that I can summarize as a blog post? I’d like to give people the best chance of winning my money. 🙂

Marie: pdf downloadable here: http://informatika.stei.itb.ac.id/~rinaldi.munir/Kriptografi/2010-2011/cryptanalysis.pdf

Alas, Gaines restricts her analysis (if I understand it correctly) to mixed alphabets where the alphabets are derived from a single alphabet, i.e “sliders”, because “

It is very seldom indeed that a series of cipher alphabets used in the same cryptogram will be unrelated alphabets. Nearly always, they will have resulted from the use of a slide“. (p.175)Still, her general points about solving ciphers in Chapter IX (that she briefly reprises in this chapter) offer a very sound set of steps to be taking here, so I’ll write them up from that viewpoint, thanks! 🙂

Sorry, meant to respond sooner. I had found it online about a year ago and was going to send you the whole thing but it was the holidays here in the States and I got distracted, so you beat me to it! The book is complicated at times but a general fount of knowledge on all things cipher, with fun examples to work through. I had noticed the slide when I reread the polyalphabetic chapter, but there are still hints there and in many chapters as you’ve seen. A good example and one of my favorites is the Nihilist Transposition Chapter IV, both from my thoughts on D’Agapeyeff’s cipher (where she does, in fact, mention it being the reverse of a typical transposition though I would still love to find the original source of that information, but that is a post for another thread) and a hint on average vowel usage….

Marie: 🙂

Right now I’m re-reading Friedman’s guide to solving simple substitution. ciphers, lots of good stuff in there too. 🙂