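A pragmatic starting point for the d’Agapeyeff cipher is to replace each of its digit pairs with a letter in sequence (61 --> A, 62 --> B, and so on). In the table below, rows give the first digit of the pair, columns give the second digit, and each cell is the number of times that pair occurs in the cryptogram, i.e.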

** .1 .2 .3 .4 .5
6. _0 17 12 16 11 --> A B C D E
7. _1 _9 _0 14 17 --> F G H I J
8. 20 17 15 11 17 --> K L M N O
9. 12 _3 _2 _1 _0 --> P Q R S T
0. _0 _0 _0 _1 _0 --> U V W X Y
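
As a minimal sketch of that substitution step (the names `PAIR_TO_LETTER` and `pairs_to_letters` are my own, purely illustrative), the table can be turned into a lookup in a few lines of Python. It assumes the pairs are read strictly left to right, with the first digit selecting the row (6, 7, 8, 9, 0) and the second digit the column (1 to 5):

```python
# Row labels are the first digit of each pair, column labels the second,
# in the same order as the table above.
ROW_DIGITS = "67890"
COL_DIGITS = "12345"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXY"  # 25 letters, assigned in table order

# Build the lookup: "61" -> "A", "62" -> "B", ..., "05" -> "Y".
PAIR_TO_LETTER = {
    r + c: ALPHABET[5 * i + j]
    for i, r in enumerate(ROW_DIGITS)
    for j, c in enumerate(COL_DIGITS)
}

def pairs_to_letters(digits: str) -> str:
    """Convert a string of digit pairs (whitespace ignored) into letters."""
    digits = "".join(digits.split())
    if len(digits) % 2:
        raise ValueError("expected an even number of digits")
    return "".join(
        PAIR_TO_LETTER[digits[k:k + 2]] for k in range(0, len(digits), 2)
    )

print(pairs_to_letters("75 62 82"))
```

Under this mapping the pairs 75, 62 and 82 come out as J, B and L, which is how row 0 of the grid below starts.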

If you then “re-flow” those letters into a 14×14 grid, many of its oddities turn up in the final (right-hand) column:-

[ 0] J B L O P B P D K D P I O N
[ 1] D I I L N M K C K K I I L B
[ 2] D J M L N P J I E M J J J R
[ 3] C E E K C K J O J J D B L Q
[ 4] O I C L J I M K E K N O D O
[ 5] D O O C L G B M B K K G K D
[ 6] C J L K D M C L O K C C C X
[ 7] I K P P N C O N E D O E B S
[ 8] B B O P O P I P G J D E J F
[ 9] E M B D I K L N B L D P K R
[10] E B D N N P M O I P K E G I
[11] M M O L M D B G B E B M J Q
[12] G C L L G G M L O N J L K M
[13] G N B L M J K D J I O K B Q

The ‘X’ (‘04’) right at the end of row #6 is highly suspicious: at least one person before me has suspected that this might somehow be a padding ‘X’ appended to the end of the (pre-transposition-stage) plaintext to bring it up to the 196 characters needed to fill a 14×14 grid.

However, I think that the three ‘Q’ (‘92’) symbols in the same rightmost column are even more suspicious: this symbol occurs exactly three times in the cryptogram, and only ever in this column. I think these are even more likely than the ‘X’ to be the final three letters of the plaintext, appended to pad it out to 14×14 = 196 characters in length.

In fact, I’m now almost certain that the correct starting point for cryptanalysis should be the diagonal transposition of the 14×14 grid, i.e. the same grid flipped about its main diagonal so that rows and columns swap places. This moves all these oddities onto the bottom (final) row of the transposed grid, leaving (presumably) a 14-column transposition cipher to solve:-

[ 0'] J D D C O D C I B E E M G G
[ 1'] B I J E I O J K B M B M C N
[ 2'] L I M E C O L P O B D O L B
[ 3'] O L L K L C K P P D N L L L
[ 4'] P N N C J L D N O I N M G M
[ 5'] B M P K I G M C P K P D G J
[ 6'] P K J J M B C O I L M B M K
[ 7'] D C I O K M L N P N O G L D
[ 8'] K K E J E B O E G B I B O J
[ 9'] D K M J K K K D J L P E N I
[10'] P I J D N K C O D D K B J O
[11'] I I J B O G C E E P E M L K
[12'] O L J L D K C B J K G J K B
[13'] N B R Q O D X S F R I Q M Q
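
For anyone wanting to reproduce the transposed grid, here is a small sketch (the `transpose` helper is just an illustrative name, and the rows are copied from the first grid above with the spaces removed):

```python
# The 14x14 grid as listed above, one string per row.
GRID = [
    "JBLOPBPDKDPION",
    "DIILNMKCKKIILB",
    "DJMLNPJIEMJJJR",
    "CEEKCKJOJJDBLQ",
    "OICLJIMKEKNODO",
    "DOOCLGBMBKKGKD",
    "CJLKDMCLOKCCCX",
    "IKPPNCONEDOEBS",
    "BBOPOPIPGJDEJF",
    "EMBDIKLNBLDPKR",
    "EBDNNPMOIPKEGI",
    "MMOLMDBGBEBMJQ",
    "GCLLGGMLONJLKM",
    "GNBLMJKDJIOKBQ",
]

def transpose(rows):
    """Flip a grid across its main diagonal (columns become rows)."""
    return ["".join(column) for column in zip(*rows)]

transposed = transpose(GRID)
for i, row in enumerate(transposed):
    print(f"[{i:2d}'] {' '.join(row)}")
```

The final row printed is the original right-hand column, so the ‘X’ and the three ‘Q’s all end up together on row #13′.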

Note the two tripled letters here (“LLL” at the end of row #3′, and “KKK” in the middle of row #9′): LLL is on a row with 6 L’s (so it’s hardly surprising that it ended up as a tripled letter post-transposition), while KKK is on a row with 4 K’s. Here are the overall letter instance counts for the cryptogram:-

.K .B .J .L .O .D .M .I .C .P .E .N .G .Q .R .F .S .X
20 17 17 17 17 16 15 14 12 12 11 11 9 3 2 1 1 1
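
These counts are easy to check. A minimal sketch, assuming the grid rows are exactly as listed earlier (with the spaces removed):

```python
from collections import Counter

# The 14 grid rows from above, split on whitespace into one string per row.
GRID = """
JBLOPBPDKDPION DIILNMKCKKIILB DJMLNPJIEMJJJR CEEKCKJOJJDBLQ
OICLJIMKEKNODO DOOCLGBMBKKGKD CJLKDMCLOKCCCX IKPPNCONEDOEBS
BBOPOPIPGJDEJF EMBDIKLNBLDPKR EBDNNPMOIPKEGI MMOLMDBGBEBMJQ
GCLLGGMLONJLKM GNBLMJKDJIOKBQ
""".split()

counts = Counter("".join(GRID))

# Print each letter with its count, most frequent first.
for letter, n in counts.most_common():
    print(f"{letter} {n:2d}")

# How much of the cryptogram do the six most frequent letters cover?
total = sum(counts.values())
top_six = sum(n for _, n in counts.most_common(6))
print(f"top six: {top_six} of {total} ({top_six / total:.0%})")
```

The order in which equally frequent letters are printed depends on where each first appears in the grid, so it may differ slightly from the table above.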

It’s interesting to compare this set with the letter frequency table of the text mini-corpus taken from d’Agapeyeff’s “Codes and Ciphers” (which I also generated recently). If you normalize that to 196 characters, here’s what you would expect to see in the cryptogram:-

.E .T .A .I .O .S .N .R .H .D .L .C .U .M .F .P .W .G .Y .B .V .K
25 18 15 14 14 14 13 12 11 .8 .7 .6 .5 .4 .4 .4 .4 .4 .3 .3 .2 .1

From this, it looks as though K probably –> E, while B/J/L/O/D seem likely to go to T/A/I/O/S. What I’m thinking here is that if this is right, all we need to solve it is to generate a moderate number of best-guess substitution values and feed those into a transposition cipher solver, i.e.:-
(a) guess that K –> E
(b) generate the 5! = 120 permutations of B/J/L/O/D –> T/A/I/O/S
(c) assign plausible values to the remainder of the used letters (in matching descending frequency order)
(d) feed the 120 versions of the transposed 14×14 grid into a reliable columnar transposition solver
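
Steps (a) to (c) could be sketched roughly as follows. This is only an illustration under the assumptions above: `candidate_keys` and `apply_key` are hypothetical names, the two frequency orderings are copied from the tables above, and step (d), the transposition solver itself, is left out entirely:

```python
from itertools import permutations

# Cipher letters in descending frequency order (from the count table above).
CIPHER_BY_FREQ = "KBJLODMICPENGQRFSX"
# Expected plaintext letters in descending frequency order (corpus table).
PLAIN_BY_FREQ = "ETAIOSNRHDLCUMFPWGYBVK"

def candidate_keys():
    """Yield 120 cipher-to-plain mappings.

    (a) K -> E is fixed, (b) B/J/L/O/D are permuted over T/A/I/O/S,
    (c) the remaining letters are paired off in frequency-rank order.
    """
    cipher_top5 = CIPHER_BY_FREQ[1:6]   # B J L O D
    plain_top5 = PLAIN_BY_FREQ[1:6]     # T A I O S
    # Note: this also assigns values to Q and X, which may only be padding.
    rest = dict(zip(CIPHER_BY_FREQ[6:], PLAIN_BY_FREQ[6:]))
    for perm in permutations(plain_top5):
        key = {"K": "E"}
        key.update(zip(cipher_top5, perm))
        key.update(rest)
        yield key

def apply_key(text, key):
    """Substitute every cipher letter in text using the given mapping."""
    return "".join(key[ch] for ch in text)

keys = list(candidate_keys())
print(len(keys))
print(apply_key("DKMJKKKDJLPENI", keys[0]))  # row #9' under the first guess
```

Each of the 120 keys would then be applied to the rows of the transposed grid before handing the result to whichever transposition solver is being used.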

My prediction is that even though the resulting substitution will still be partly wrong, getting the 6 most frequent letters right (i.e. 20 + 17 + 17 + 17 + 17 + 16 = 104 characters, ~53% of the cryptogram) and possibly some of the others (by chance) will allow the transposition solver to get us close enough to the answer that we can tell from its output what the correct transposition order is. Does that sound reasonable?

PS: if the final row is partly artificial, it may be a good idea not to feed that into the transposition solver, i.e. only try to solve a 14×13. Incidentally, a very good freeware cipher solver Windows application is CryptoCrack, but more about that another day… 🙂

7 thoughts on “Why I think the d’Agapeyeff cipher is diagonally transposed…”

1. Jim Melichar on January 28, 2014 at 3:58 pm said:

Nick, before you hit all the columnar transpositions, I still think it would be worth your time to remove the other square fills and extractions from the consideration set. Since there’s no columnar transposition required you can simply “put the ciphertext back into the 14×14 square in one pattern” and “read it” using any other pattern.

There are 48 patterns in total, therefore if you assume a pattern was used to write the ciphertext into the square (let’s say diagonal) and another was used to read it out of the square (let’s say columnar), you can quickly check those 48 * 47 permutations for plaintext.

If you’d like, I have some C# code that has the patterns for the 48 ways you can fill a square. I’m sure you can quickly mod it to C.

Jim

2. Jim: the odd stuff in the right hand column looks to me like a set of artefacts from the final line of a transposition, so I’m going to pursue that for the next few days. Unfortunately, CryptoCrack’s Incomplete Columnar solver currently doesn’t seem to handle width-14 transpositions (though it handles 10, 11, 12, 13, and 15 just fine, bah!), so I’ve asked the author if it could be extended to include 14 as well. Fingers crossed! 🙂

3. Jim Melichar on January 28, 2014 at 5:03 pm said:

Understood. Why the need to use an incomplete columnar solver? Just blast the entire last line away. Log tetragraph scoring will find the solution to the monoalphabetic substitution via hillclimbing with 182 characters or 196.

4. Jim: unless you know CryptoCrack better (I certainly couldn’t see anything in the CryptoCrack documentation), the Incomplete Columnar solver seems to do some kind of hill-climbing and terminates early, while the Complete Columnar seems to brute force it (which isn’t a lot of practical use when faced with 14! permutations).

5. Jim Melichar on January 28, 2014 at 5:29 pm said:

If they don’t add 14, I can code something up to test it as well.

Since most of my work has been on fractionated ciphers, I’d just need to modify it a little. I build as I need something 🙂

6. Jim: let’s hold back a few days, give the various requests a chance to percolate, see where it all lands. 🙂

7. Jim Melichar on January 30, 2014 at 1:47 am said:

Nick, I now realize what you mean by diagonal, which is different than what I consider diagonal (an extraction pattern from a square).

Your contention is the plaintext was written horizontally and then removed from the square vertically via a key = width 14.

Got it.