Even if (and I would not disagree that it’s a big ‘if’) we accept that the 14×14 rearrangement of the d’Agapeyeff Challenge Cipher’s Polybius square output is a staging point on the reconstructive road back to the original plaintext, we’re still left with an unknown transposition of an unknown substitution. Which is not great. 🙁

However, what struck me this morning was that if d’Agapeyeff used a known text as the plaintext AND that plaintext was in Project Gutenberg, we could perhaps try using Big Data techniques to find the best matching frequency distribution of any consecutive 196 characters.

In practical terms, the idea would be to do the following for all of the Project Gutenberg texts:

  • transform them into pure text versions (i.e. A-Z only)
  • frequency count each consecutive block of 196 characters
  • sort that block’s frequency count
  • compare that sorted frequency count against the sorted frequency count of the d’Agapeyeff 14×14
  • display the 100 blocks ‘closest’ to the d’Agapeyeff 14×14
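For a single text, the steps above might be sketched in Python along these lines. This is only a rough sketch: the function names, the choice of an L1 distance between sorted count vectors, and the heap-based top-N are all my assumptions rather than a tested pipeline, and TARGET is simply the sorted frequency count of the d’Agapeyeff 14×14 padded out to 26 entries.

```python
import heapq
import re
from collections import Counter

BLOCK = 196
# Sorted frequency counts of the d'Agapeyeff 14x14, padded with zeros to 26 entries.
TARGET = [20, 17, 17, 17, 17, 16, 15, 14, 12, 12, 11, 11, 9, 3, 2, 1, 1, 1] + [0] * 8


def clean(text):
    """Reduce a text to A-Z only."""
    return re.sub(r"[^A-Z]", "", text.upper())


def sorted_counts(counts):
    """Descending list of 26 letter counts (which letter is which is ignored)."""
    values = sorted(counts.values(), reverse=True)
    return values + [0] * (26 - len(values))


def distance(counts):
    """L1 distance between a block's sorted counts and the target (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(sorted_counts(counts), TARGET))


def closest_blocks(text, top_n=100, block=BLOCK):
    """Return the top_n smallest (distance, offset) pairs over every consecutive block."""
    letters = clean(text)
    if len(letters) < block:
        return []
    counts = Counter(letters[:block])
    scored = [(distance(counts), 0)]
    for start in range(1, len(letters) - block + 1):
        # Slide the window: drop the letter leaving it, add the letter entering it.
        old, new = letters[start - 1], letters[start + block - 1]
        counts[old] -= 1
        if counts[old] == 0:
            del counts[old]
        counts[new] += 1
        scored.append((distance(counts), start))
    return heapq.nsmallest(top_n, scored)
```

Because only the sorted counts are compared, the result is blind to which substitution and which transposition were used, which is the whole point of the exercise. Running it over every Gutenberg text would still need a driver loop and some way of merging the per-text results.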

At the very least, the specific kind of passages this search highlights might well yield some insight into what is going on under the hood. Might be a bit of fun for a Hadoop person to try?

PS: the 14×14 d’Agapeyeff staging point looks like this:

    JBLOPBPDKDPION
    DIILNMKCKKIILB
    DJMLNPJIEMJJJR
    CEEKCKJOJJDBLQ
    OICLJIMKEKNODO
    DOOCLGBMBKKGKD
    CJLKDMCLOKCCCX
    IKPPNCONEDOEBS
    BBOPOPIPGJDEJF
    EMBDIKLNBLDPKR
    EBDNNPMOIPKEGI
    MMOLMDBGBEBMJQ
    GCLLGGMLONJLKM
    GNBLMJKDJIOKBQ

The frequency distribution for this is:

K  B  J  L  O  D  M  I  C  P  E  N  G Q R F S X A H T U V W Y Z
20 17 17 17 17 16 15 14 12 12 11 11 9 3 2 1 1 1 0 0 0 0 0 0 0 0
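If you want to check those counts for yourself, a few lines of Python should do it. This is just a sketch: GRID is the 14×14 typed in from above, and the alphabetical tie-break when sorting is my own choice.

```python
from collections import Counter

GRID = [
    "JBLOPBPDKDPION", "DIILNMKCKKIILB", "DJMLNPJIEMJJJR", "CEEKCKJOJJDBLQ",
    "OICLJIMKEKNODO", "DOOCLGBMBKKGKD", "CJLKDMCLOKCCCX", "IKPPNCONEDOEBS",
    "BBOPOPIPGJDEJF", "EMBDIKLNBLDPKR", "EBDNNPMOIPKEGI", "MMOLMDBGBEBMJQ",
    "GCLLGGMLONJLKM", "GNBLMJKDJIOKBQ",
]

counts = Counter("".join(GRID))
# Most frequent first; ties broken alphabetically (an arbitrary choice).
table = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
for letter, n in table:
    print(letter, n)
```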

Normally for a challenge cipher in an English cipher book, you’d start by guessing that ‘K’ maps to plaintext ‘e’; that ‘BJLO’ map to (some combination of) plaintext ‘taio’ (or similar); and then try to make up the rest. The problem here is that because we’re apparently dealing with a substitution AND a transposition, we don’t have that cryptological luxury.

Yet if it turns out that the best frequency distribution matches are all from Shakespeare, this might give us a very strong hint as to where to look for the plaintext. Just a thought! 🙂

A little while back, I had an email from Marie about Alexander d’Agapeyeff’s (1939) book “Codes and Ciphers”, highlighting some interesting mistakes she had found in his section on the double transposition cipher.

D’Agapeyeff described this as a cipher system that the Russian Nihilists had used, but said that they had used the same keyword for both halves of the transposition (i.e. for transposing both the columns and the rows), a technical flaw that made it easy to crack. (Oddly, the Nihilists are nowadays associated with an entirely different kind of encipherment.)

Let’s take a closer look…

D’Agapeyeff’s Double Transposition

What follows is d’Agapeyeff’s account, with comments along the way.

At the end of the nineteenth century the Russian Nihilists used a double cipher, which, having been transposed vertically, was then transposed horizontally; but they made the mistake of using the same keyword in both transpositions. As it is a common variation of double columnar cipher, we give it as an example:

The first thing that Marie picked up on was that the way that d’Agapeyeff converted the transposition keyword SCHUVALOF to an ordering was clearly incorrect: F is the sixth letter of the alphabet, so there is no obvious way that it would be counted as the highest ranked of the nine letters in the keyword. When I looked at this, I immediately guessed that it should instead have read SCHUVALOV – as it turned out, this was a good try, though still very slightly wrong. 😐

Regardless, it should already be clear that something a little non-obvious is going on here.

Now suppose we have to encipher the following: ‘Reunion to-morrow at three p.m. Bring arms as we shall attempt to bomb the railway station. Chief.’

The ‘abcd’ at the end are ‘nulls’ used to fill in the squares.

Now we transpose the message according to the letter sequence of the keyword:

So the message reads:

OMPBOETTMWORATMROTMREBRHEPIATHILBERWTYSIOATANOEUNTRNIOSGAASNRMWLSHATEALTAHIBCCEFD

In all languages where certain letters must follow or precede certain others, the deciphering of this script will never present difficulties. We first count the number of letters in the script (81), which will give us the size of the square (9×9), and once this is done all we have to do is remember that in nine cases out of ten ‘h’ follows either ‘t’ or ‘s’ or ‘c’, and that the bigrams such as AT, TO, WE and the very helpful (English) trigram ‘the’, and the doubles TT, LL, EE, etc., are the most common. In fact, the Russian police soon found out all about that conspiracy.

The second thing Marie noted here was that d’Agapeyeff was using the double transposition decryption direction here, rather than the encryption direction.

All in all, I’d agree with Marie that d’Agapeyeff didn’t seem to have fully understood how the system worked. Smartly, though, Marie now doggedly decided to look at d’Agapeyeff’s crypto sources, to see if he had copied this whole section blindly from somewhere. And, eventually, she found that d’Agapeyeff’s direct source for the above was none other than…

Auguste Kerckhoffs

…the Dutch cryptographer Auguste Kerckhoffs (1835-1903).

Kerckhoffs’ influential book (well, extended article, really) “La Cryptographie Militaire” is available online as a PDF, or as an HTMLized version here.

What follows is my usual free translation of Kerckhoffs’ description of double transposition, which we can immediately see beyond any reasonable doubt as being the source for d’Agapeyeff’s version:

On the occasion of the Nihilists’ last appearance in court, the Russian newspapers published the accused’s secret cipher. It is a system of double transposition, where the letters are first transposed by vertical columns, and are then further transposed by horizontal rows. The same word serves as a key for both transpositions: to do this, the keyword is transformed into a series of numbers, where each number matches the rank of the letter within the normal alphabetical sequence.

Here is the process applied to the word SCHUVALOW:

OK, though I was on this occasion very slightly wrong (SCHUVALOV rather than SCHUVALOW), I was at least wrong in the right kind of way. 🙂 Kerckhoffs continues:

Now, if we were to transpose a sentence such as this one – Vous êtes invité à vous trouver ce soir, à onze heures précises, au local habituel de nos réunions – we would proceed first as in the previously described [single transposition] case, and then carry out the same operation for the horizontal rows.

   = s c i a u e s e l a v i v o n t e u v t r e r s o u c a c a b i o l h t n e l o s u d e r, etc.

However complicated this transposition may appear to us, deciphering a cryptogram written with this system can never present insurmountable difficulties in languages where certain letters only present themselves in particular combinations, such as q or x in French. Here, the Russian decipherers seem to have carried out their decryption work in a relatively short time.

For any passing conlang fans, Auguste Kerckhoffs was also closely associated with the artificial language Volapük, which some people think is really koldälik. 🙂

d’Agapeyeff + Kerckhoffs = …?

It’s important to remember that d’Agapeyeff wasn’t himself a cryptographer, but rather someone who was trying to collect together interesting crypto stuff into a book that had originally been commissioned for someone else entirely to write. The project wasn’t something he was aiming to do, but rather something that fell in his lap.

As Marie points out, the big technical thing that d’Agapeyeff got wrong is that the numbers are the wrong way round, and so he is performing a double transposition decryption rather than a double transposition encryption: the two are not the same at all. That is, if you used SCHUVALOW as your single transposition keyword and then single transposition encrypted the text “SCHUVALOW”, you should get the ciphertext “ACHLOSUVW”: but both Kerckhoffs and d’Agapeyeff (copying Kerckhoffs) seem to have got this the wrong way round.
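To make the keyword-to-ordering step and the two directions concrete, here is a small Python sketch. It only rearranges the letters of a single row (which is all the SCHUVALOW example needs); the function names, the left-to-right tie-break for repeated letters, and the second keyword CIPHER are my own illustrative assumptions, not anything from Kerckhoffs or d’Agapeyeff.

```python
def key_order(keyword):
    """Column indices taken in alphabetical order of the keyword's letters.
    Repeated letters are taken left to right (an assumed tie-break)."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))


def key_ranks(keyword):
    """1-based alphabetical rank of each keyword letter, in keyword order."""
    ranks = [0] * len(keyword)
    for rank, col in enumerate(key_order(keyword), start=1):
        ranks[col] = rank
    return ranks


def encrypt_row(row, keyword):
    """Read one row's columns out in alphabetical key order."""
    return "".join(row[col] for col in key_order(keyword))


def decrypt_row(row, keyword):
    """Undo encrypt_row: put each letter back under its original key letter."""
    plain = [""] * len(keyword)
    for letter, col in zip(row, key_order(keyword)):
        plain[col] = letter
    return "".join(plain)


print(key_ranks("SCHUVALOW"))                 # [6, 2, 3, 7, 8, 1, 4, 5, 9]
print(encrypt_row("SCHUVALOW", "SCHUVALOW"))  # ACHLOSUVW
print(encrypt_row("ABCDEF", "CIPHER"))        # AEDBCF
print(decrypt_row("ABCDEF", "CIPHER"))        # ADECBF
```

With the made-up keyword CIPHER, the two directions give visibly different results, which is the general situation. One thing worth knowing if you experiment with this: the ordering derived from SCHUVALOW happens to be its own inverse, so a one-row test with that particular keyword cannot by itself show the two directions apart.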

Having thought about this for a little while, I’ve come to suspect that d’Agapeyeff may well have faultily believed that double transposition was a self-inverse process, i.e. where the decryption and encryption transformations are identical.

All of which would dovetail very neatly indeed with the report that we have that he was unable to decrypt his own challenge cipher: for if he (wrongly) believed that double transposition was self-inverse, then he wouldn’t (if his challenge cipher had used double transposition) have been able to decrypt it at all. If this is correct, then his failure wasn’t anything as foolish as misremembering the keyword, but instead misunderstanding one of the component ciphers that made up the overall chain.

Might this insight help us decrypt his challenge cipher? Well… insofar as it now seems far more likely to me that he used double transposition as one of his stages, then the answer may very well be yes. Hopefully we shall see… 🙂

For those of you who have had their fill of the last week’s posts on the Somerton Man, here’s a different cipher mystery that doesn’t get aired even 1% as much: the Feynman Ciphers.

The first Feynman Cipher (F1, 380 characters long) turned out to be based on a 5 x 76 transposition path cipher (the plaintext was “WHANTHATAPRILLEWITHHISSHOURESSOOTE”, i.e. the start of Chaucer’s Canterbury Tales), but what is a little odd is that nobody seems to have yet made any inroads at all into the other two, though it is often remarked that transposition may well be involved. In that sense, they’re a bit like the d’Agapeyeff challenge cipher, which is also believed to be a multi-stage cipher including one or more transposition stages.

At 261 characters long, the second Feynman Cipher (F2) is a little shorter than F1: this length factorizes to 3 x 3 x 29, or 9 x 29, or 3 x 87. It also includes all 26 letters, which rules out a lot of tricky ciphers such as Playfair and Phillips.
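If you want the candidate rectangle sizes for all three Feynman ciphers at once, a tiny Python helper will list them. The function name grid_shapes is my own; the lengths (380, 261 and 231) are the ones quoted in this post.

```python
import math


def grid_shapes(n):
    """All (rows, cols) pairs with 2 <= rows <= cols and rows * cols == n."""
    return [(r, n // r) for r in range(2, math.isqrt(n) + 1) if n % r == 0]


for name, length in [("F1", 380), ("F2", 261), ("F3", 231)]:
    print(name, length, grid_shapes(length))
```

This gives (3, 87) and (9, 29) for F2, and (3, 77), (7, 33) and (11, 21) for F3; each shape could of course also be used the other way round.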

XUKEXWSLZJUAXUNKIGWFSOZRAWURORKXAOS
LHROBXBTKCMUWDVPTFBLMKEFVWMUXTVTWUI
DDJVZKBRMCWOIWYDXMLUFPVSHAGSVWUFWOR
CWUIDUJCNVTTBERTUNOJUZHVTWKORSVRZSV
VFSQXOCMUWPYTRLGBMCYPOJCLRIYTVFCCMU
WUFPOXCNMCIWMSKPXEDLYIQKDJWIWCJUMVR
CJUMVRKXWURKPSEEIWZVXULEIOETOOFWKBI
UXPXUGOWLFPWUSCH

Though normally very good at identifying cipher types, Cryptocrack doesn’t do particularly well with this one: it suggests Phillips, FracMorse, Playfair and Beaufort as its top four tips, none of which seem hugely likely to me. What is interesting, though, is that if you transpose the ciphertext (say, using some of the seven transposed routes listed by James Lyons), Cryptocrack produces a quite different set of recommendations, suggesting instead Trifid (which it almost certainly isn’t), but more reasonably Running Key and occasionally Vigenere.

Personally, I don’t think it’s a Vig: so right now, my prediction is that it’ll turn out to be a funky path transposition combined with Running Key (combining this with Vigenere would surely be just a bit too sadistic). Perhaps this will be what James Lyons will say too, when he gets round to posting part 3 (his part 2 is here).

Finally: the third Feynman Cipher (F3) is short too: 231 characters, which factorizes to 3 x 7 x 11. Much as James Lyons notes, I currently expect more or less everything said about F2 to hold true for F3: so I expect it’s probably a Running Key (or perhaps Vigenere, but I doubt it) combined with a funky path transposition.

WURVFXGJYTHEIZXSQXOBGSVRUDOOJXATBKT
ARVIXPYTMYABMVUFXPXKUJVPLSDVTGNGOSI
GLWURPKFCVGELLRNNGLPYTFVTPXAJOSCWRO
DORWNWSICLFKEMOTGJYCRRAOJVNTODVMNSQ
IVICRBICRUDCSKXYPDMDROJUZICRVFWXIFP
XIVVIEPYTDOIAVRBOOXWRAKPSZXTZKVROSW
CRCFVEESOLWKTOBXAUXVB

What do you think?

A pragmatic starting point for the d’Agapeyeff cipher is to sequentially replace its digit pairs with letters, i.e.

** .1 .2 .3 .4 .5
6. _0 17 12 16 11 --> A B C D E
7. _1 _9 _0 14 17 --> F G H I J
8. 20 17 15 11 17 --> K L M N O
9. 12 _3 _2 _1 _0 --> P Q R S T
0. _0 _0 _0 _1 _0 --> U V W X Y

If you then “re-flow” those letters into a 14×14 grid, many of its oddities are to be found in the final right hand column:-

[ 0] J B L O P B P D K D P I O N
[ 1] D I I L N M K C K K I I L B
[ 2] D J M L N P J I E M J J J R
[ 3] C E E K C K J O J J D B L Q
[ 4] O I C L J I M K E K N O D O
[ 5] D O O C L G B M B K K G K D
[ 6] C J L K D M C L O K C C C X
[ 7] I K P P N C O N E D O E B S
[ 8] B B O P O P I P G J D E J F
[ 9] E M B D I K L N B L D P K R
[10] E B D N N P M O I P K E G I
[11] M M O L M D B G B E B M J Q
[12] G C L L G G M L O N J L K M
[13] G N B L M J K D J I O K B Q

The ‘X’ (’04’) right at the end of row #6 is highly suspicious: at least one person before me has suspected that this might somehow be a padding ‘X’ appended to the end of the (pre-transposition-stage) plaintext to bring it up to a 14×14 multiple.

However, I think that the three ‘Q’ (’92’) symbols in the same rightmost column are even more suspicious: this symbol occurs exactly three times in the cryptogram, and only ever in this column. I think these are even more likely than the ‘X’ to be the final three letters of the plaintext, appended to pad it up to 14×14 = 196 characters in length.

In fact, I’m now almost certain that the correct starting point for cryptanalysis should be the diagonal transposition of the 14×14 grid, which transformation would flip all these oddities across onto the bottom (final) row of the transposed grid, leaving (presumably) a 14-column transposition cipher to solve:-

[ 0'] J D D C O D C I B E E M G G
[ 1'] B I J E I O J K B M B M C N
[ 2'] L I M E C O L P O B D O L B
[ 3'] O L L K L C K P P D N L L L
[ 4'] P N N C J L D N O I N M G M
[ 5'] B M P K I G M C P K P D G J
[ 6'] P K J J M B C O I L M B M K
[ 7'] D C I O K M L N P N O G L D
[ 8'] K K E J E B O E G B I B O J
[ 9'] D K M J K K K D J L P E N I
[10'] P I J D N K C O D D K B J O
[11'] I I J B O G C E E P E M L K
[12'] O L J L D K C B J K G J K B
[13'] N B R Q O D X S F R I Q M Q
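The diagonal flip itself is a one-liner in Python, should you want to reproduce the transposed grid (or drop its final row before feeding it to a solver). GRID is the 14×14 typed in from above; the function name is mine.

```python
GRID = [
    "JBLOPBPDKDPION", "DIILNMKCKKIILB", "DJMLNPJIEMJJJR", "CEEKCKJOJJDBLQ",
    "OICLJIMKEKNODO", "DOOCLGBMBKKGKD", "CJLKDMCLOKCCCX", "IKPPNCONEDOEBS",
    "BBOPOPIPGJDEJF", "EMBDIKLNBLDPKR", "EBDNNPMOIPKEGI", "MMOLMDBGBEBMJQ",
    "GCLLGGMLONJLKM", "GNBLMJKDJIOKBQ",
]


def transpose(rows):
    """Flip a grid across its main diagonal: column n becomes row n."""
    return ["".join(column) for column in zip(*rows)]


flipped = transpose(GRID)
for index, row in enumerate(flipped):
    print(f"[{index:2d}'] {' '.join(row)}")
```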

Here I’ve highlighted the two tripled letters (“LLL” on row #3′, and “KKK” on row #9′): here LLL is on a row with 6 L’s (so it’s hardly surprising that it ended up as a tripled letter post-transposition), while KKK is on a row with 4 K’s. Here are the overall letter instance counts for the cryptogram:-

.K .B .J .L .O .D .M .I .C .P .E .N .G .Q .R .F .S .X
20 17 17 17 17 16 15 14 12 12 11 11 .9 .3 .2 .1 .1 .1

It’s interesting to compare this set with the letter frequency table of the text mini-corpus taken from d’Agapeyeff’s “Codes and Ciphers” (which I also generated recently). If you normalize that to 196 characters, here’s what you would expect to see in the cryptogram:-

.E .T .A .I .O .S .N .R .H .D .L .C .U .M .F .P .W .G .Y .B .V .K
25 18 15 14 14 14 13 12 11 .8 .7 .6 .5 .4 .4 .4 .4 .4 .3 .3 .2 .1

From this, it looks as though K probably –> E, while B/J/L/O/D seem likely to go to T/A/I/O/S. What I’m thinking here is that if this is right, all we need to solve it is to generate a moderate number of best-guess substitution values and feed those into a transposition cipher solver, i.e.:-
(a) guess that K –> E
(b) generate the 5! = 120 permutations of B/J/L/O/D –> T/A/I/O/S
(c) assign plausible values to the remainder of the used letters (in matching descending frequency order)
(d) feed the 120 versions of the transposed 14×14 grid into a reliable columnar transposition solver
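Steps (a) to (c) could be sketched in Python as below; step (d) is left to whatever transposition solver you prefer. The two frequency-ordered strings are read straight off the tables above, but the function names and the simple rank-for-rank pairing of the remaining letters are my own assumptions.

```python
from itertools import permutations

CIPHER_BY_FREQ = "KBJLODMICPENGQRFSX"     # cryptogram letters, most frequent first
PLAIN_BY_FREQ = "ETAIOSNRHDLCUMFPWGYBVK"  # expected order from the mini-corpus table


def candidate_keys():
    """Yield the 5! = 120 candidate substitution keys as cipher -> plain dicts."""
    # (c) pair off the remaining letters rank for rank (zip stops at the shorter list)
    rest = dict(zip(CIPHER_BY_FREQ[6:], PLAIN_BY_FREQ[6:]))
    for perm in permutations("TAIOS"):
        key = {"K": "E"}               # (a) guess that K -> E
        key.update(zip("BJLOD", perm))  # (b) one permutation of B/J/L/O/D -> T/A/I/O/S
        key.update(rest)
        yield key


def apply_key(key, rows):
    """Substitute every letter of every grid row using one candidate key."""
    return ["".join(key[letter] for letter in row) for row in rows]


keys = list(candidate_keys())
print(len(keys))                                # 120
print(apply_key(keys[0], ["JBLOPBPDKDPION"]))   # ['ATIODTDSESDROC']
```

Each of the 120 keys would then be applied to the transposed 14×14 (or 14×13) grid before being handed to the solver.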

My prediction is that even though this will still be wrong, getting the 6 most popular letters right (i.e. 20 + 17 + 17 + 17 + 17 + 16 = 104 characters, ~53% of the cryptogram) and possibly some of the others (by chance) will allow the transposition solver to get us close enough to the answer, that we can tell from its output what the correct transposition order is. Does that sound reasonable?

PS: if the final row is partly artificial, it may be a good idea not to feed that into the transposition solver, i.e. only try to solve a 14×13. Incidentally, a very good freeware cipher solver Windows application is CryptoCrack, but more about that another day… 🙂

I’ve been trying to break the d’Agapeyeff challenge cipher this week, a process that I (along with several other cipher commentators, although opinions differ etc etc) strongly suspect will involve solving a 14×14 transposition cipher and a substitution cipher simultaneously.

A plausible-sounding way to try to do this would be to model the distribution of digraph frequency counts in English texts, and then for a given transposition compare an ordered table of its digraph frequency counts against that model. However, when I tried this with some test text (taken from d’Agapeyeff’s book), the English digraph frequency values given on the Internet weren’t even close.

I initially looked at getting a corpus of British English text to generate a proper digraph frequency table: but that proved to be difficult and expensive, with the bother of licences and licence fees to deal with. But then I thought… why not use d’Agapeyeff’s book “Codes and Ciphers” itself as the corpus? Sure, it’s on a much smaller scale, but it would surely be more statistically representative of the cryptogram’s plaintext than the complete works of Shakespeare (which are often included in English corpora, presumably on the principle of what-the-heck-let’s-throw-it-all-in-can-it-really-hurt?).

Even though the book’s text looked nice and clean to my eye, OCR’ing it turned out to be completely unsatisfactory: and so I was delighted to find a page put up by regular Cipher Mysteries commenter Menno Knul containing a lot of text from “Codes and Ciphers” (thanks Menno!). After a bit of tweaking (fixing some typos, removing foreign language quotes, removing confusing cipher / code passages, etc), I then ended up with a reasonably workable d’Agapeyeff mini-corpus to plug into a trivial C digraph-counting programme.

So, here are d’Agapeyeff’s top 50 digraphs from the text of “Codes and Ciphers” (but with spaces, punctuation and numbers removed), together with their frequency percentages in descending order. I’ll be using this table before very long to try to break his cipher, fingers crossed it’ll do the trick!

TH,3.23744%
HE,2.80072%
IN,2.03171%
ER,1.89246%
AN,1.50321%
ES,1.41460%
RE,1.39245%
ON,1.21523%
NT,1.20257%
ED,1.20257%
ST,1.19624%
EN,1.14561%
SE,1.10447%
EA,1.08548%
TE,1.05383%
TI,1.04750%
ET,1.02851%
ND,1.01269%
IS,0.99370%
OF,0.98104%
TO,0.95889%
OR,0.94940%
AT,0.92725%
AS,0.92725%
IT,0.87978%
HI,0.83547%
LE,0.82598%
NG,0.81648%
AL,0.81648%
HA,0.80699%
AR,0.80699%
SA,0.73104%
SI,0.71838%
VE,0.70255%
RI,0.69623%
CO,0.69306%
SO,0.68673%
ME,0.67724%
EC,0.67407%
DE,0.66774%
RA,0.60129%
RS,0.59496%
RO,0.59179%
DI,0.59179%
TT,0.58546%
OU,0.58546%
TA,0.58230%
BE,0.57597%
US,0.54432%
IC,0.52850%
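For anyone wanting to build a similar table from their own corpus, here is roughly what such a digraph counter looks like. Note that this is a Python sketch rather than the C programme I actually used, and it assumes overlapping digraphs counted over the A-Z-only text; the function name is made up for the example.

```python
import re
from collections import Counter


def digraph_percentages(text, top=50):
    """Return the top digraphs of an A-Z-only version of text as (digraph, percent)."""
    letters = re.sub(r"[^A-Z]", "", text.upper())
    # Overlapping digraphs: positions 0-1, 1-2, 2-3, ...
    pairs = Counter(letters[i:i + 2] for i in range(len(letters) - 1))
    total = sum(pairs.values())
    if total == 0:
        return []
    return [(digraph, 100.0 * n / total) for digraph, n in pairs.most_common(top)]


for digraph, percent in digraph_percentages("The thin thing then thought."):
    print(f"{digraph},{percent:.5f}%")
```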

Just a quicky adminy post, lots of loose nails all needing tapping in, you know how it is.

(1) Happy New Year to you all! (…unless associating happiness with a particular Western calendar is far too politically incorrect for you, in which case I don’t really know what to say).

(2) A big Thanks! going out from me to all you Cipher Mysteries visitors and subscribers, as the blog has now had 600K visitors and more than a million page hits! I know it’s just a number, but it’s a nice number and it’s mine.

(3) Right now, I’m optimistic that 2014 will be a genuinely productive year for cracking cipher mysteries – specifically, I’m more positive about the Somerton Man and d’Agapeyeff Cipher than I have been for a long time. I’ve also got lots of interesting posts planned, though Real Life will inevitably intrude on the fantasy image of Cipher Mysteries Mansions that people seem to have, hence these will doubtless take me much longer than I expect.

(4) I’ve just yesterday sold my last copy of the most recent batch of The Curse Of The Voynich, but another box of books has already been ordered and so the price of copies on Amazon Marketplace etc should be back to a practical level (i.e. not £123, ha!) within a fortnight or so.

(5) I’ve recently published a fun interactive chess ebook some of you may like, called “Chess Superminiatures” (on Amazon UK, Amazon US, etc.). It contains a hundred really super miniature chess games (all under ten moves!) and a whole load of what-happened-next mini-quizzes, which are particularly good for trains or planes. If you have a Kindle and like a little bit of chess, I’m pretty sure you’ll enjoy this! 🙂

(6) At the moment, I’m vaguely thinking about arranging a Cipher Mysteries pub meet some time in February, so if any of you are aiming to be visiting London around then, drop me a line and I’ll see if I can arrange it to coincide: it’s always nice when we manage to make that happen.

As is well known, Alexander d’Agapeyeff’s 1939 challenge cipher looks like this:-

75628 28591 62916 48164 91748 58464 74748 28483 81638 18174
74826 26475 83828 49175 74658 37575 75936 36565 81638 17585
75756 46282 92857 46382 75748 38165 81848 56485 64858 56382
72628 36281 81728 16463 75828 16483 63828 58163 63630 47481
91918 46385 84656 48565 62946 26285 91859 17491 72756 46575
71658 36264 74818 28462 82649 18193 65626 48484 91838 57491
81657 27483 83858 28364 62726 26562 83759 27263 82827 27283
82858 47582 81837 28462 82837 58164 75748 58162 92000

Almost all cryptanalyses of this ciphertext start from the reasonable observation that (a) this is dominated by number-pairs of the form [6/7/8/9/0][1/2/3/4/5], and that (b) these pairs have a very strongly language-like distribution:-

** .1 .2 .3 .4 .5
6. _0 17 12 16 11
7. _1 _9 _0 14 17
8. 20 17 15 11 17
9. 12 _3 _2 _1 _0
0. _0 _0 _0 _1 _0

To simplify discussion (and ignoring the issue of fractionation for the moment), we can assign these structured number pairs to letters in an obvious sort of order, e.g.:-

** .1 .2 .3 .4 .5
6. _A _B _C _D _E
7. _F _G _H _I _J
8. _K _L _M _N _O
9. _P _Q _R _S _T
0. _U _V _W _X _Y

In which case, the rather less verbose version of the same ciphertext would look like this:-

J B L O P B P D K D P I O N D I I L N M
K C K K I I L B D J M L N P J I E M J J
J R C E E K C K J O J J D B L Q O I C L
J I M K E K N O D O D O O C L G B M B K
K G K D C J L K D M C L O K C C C X I K
P P N C O N E D O E B S B B O P O P I P
G J D E J F E M B D I K L N B L D P K R
E B D N N P M O I P K E G I M M O L M D
B G B E B M J Q G C L L G G M L O N J L
K M G N B L M J K D J I O K B Q - - - -
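The pair-to-letter conversion is easy to reproduce in Python. In this sketch the ciphertext is typed in from above, the table is the same “obvious sort of order” A to Y assignment, and the ‘00’ filler pair (plus the odd digit left over at the very end) is simply skipped; the function names are my own.

```python
CIPHERTEXT = """
75628 28591 62916 48164 91748 58464 74748 28483 81638 18174
74826 26475 83828 49175 74658 37575 75936 36565 81638 17585
75756 46282 92857 46382 75748 38165 81848 56485 64858 56382
72628 36281 81728 16463 75828 16483 63828 58163 63630 47481
91918 46385 84656 48565 62946 26285 91859 17491 72756 46575
71658 36264 74818 28462 82649 18193 65626 48484 91838 57491
81657 27483 83858 28364 62726 26562 83759 27263 82827 27283
82858 47582 81837 28462 82837 58164 75748 58162 92000
"""

ROWS = "67890"  # first digit of each pair
COLS = "12345"  # second digit of each pair
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXY"
TABLE = {r + c: ALPHABET[5 * i + j]
         for i, r in enumerate(ROWS) for j, c in enumerate(COLS)}


def to_letters(ciphertext):
    """Convert the digit stream to letters, skipping the trailing '00' filler."""
    digits = "".join(ciphertext.split())
    pairs = [digits[i:i + 2] for i in range(0, len(digits) - 1, 2)]
    return "".join(TABLE[pair] for pair in pairs if pair != "00")


def reflow(letters, width):
    """Cut a letter string into rows of the given width."""
    return [letters[i:i + width] for i in range(0, len(letters), width)]


letters = to_letters(CIPHERTEXT)
for row in reflow(letters, 14):
    print(" ".join(row))
```

Calling reflow(letters, 20) instead gives the 20-wide layout shown above, and 14 gives the 14×14 grid.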

Cryptanalytically, the problem with this as a ciphertext is that, even discounting the ’00’ filler-style characters at the end, it simply has too many doubled (and indeed tripled) letters to be simple English: 11 doublets, plus 2 additional triplets. Hence the chance of any given letter in this text being followed by itself within this text is 15/195 ~= 7.7%, while the chance of any given letter being followed by itself twice more is 2/194 ~= 1.03%.

According to my spreadsheet, if the letters were jumbled randomly, the chances of the same letter appearing twice in a row would be 7.44% (very slightly less than what we see, but still broadly the same), while the chances of the same letter appearing three times in a row would be 0.594% (quite a lot less).
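The spreadsheet sums are simple enough to redo in a few lines of Python. The assumption here (mine, for this sketch) is that each position is treated as an independent draw from the cryptogram’s letter frequencies, so the chance of a run of a given length is the sum of each letter’s probability raised to that power; that reproduces both figures to the precision quoted.

```python
# Letter instance counts for the cryptogram, most frequent first.
COUNTS = [20, 17, 17, 17, 17, 16, 15, 14, 12, 12, 11, 11, 9, 3, 2, 1, 1, 1]


def repeat_probability(counts, run):
    """Chance that `run` consecutive letters are all the same letter,
    assuming independent draws from the given frequency counts."""
    total = sum(counts)
    return sum((n / total) ** run for n in counts)


print(f"{100 * repeat_probability(COUNTS, 2):.2f}%")  # 7.44%
print(f"{100 * repeat_probability(COUNTS, 3):.3f}%")  # 0.594%
```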

It struck me that these statistics might possibly be what we might expect to see for texts formed of every second or every third letter of English. So, I decided to test this notion with some brief tests on Moby Dick:-

Distance – doubles – triples
1 – 3.693% – 0.075%
2 – 4.426% – 0.275%
3 – 5.994% – 0.476%
4 – 6.289% – 0.466%
5 – 6.682% – 0.566%
6 – 6.491% – 0.546%
7 – 6.372% – 0.508%
8 – 6.536% – 0.533%
9 – 6.544% – 0.536%
n – 6.524% – 0.525% (i.e. predicted percentages based purely on frequency counts)
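A test like this can be sketched in Python as follows. This is one reading of the measurement rather than a record of exactly how the table was produced: it assumes that a ‘double’ at distance d means a letter equal to the letter d places further on, and a ‘triple’ means three equal letters spaced d apart, with everything outside A-Z stripped first.

```python
import re


def repeat_rates(text, distance):
    """Return (double_rate, triple_rate) for letters `distance` apart in text."""
    letters = re.sub(r"[^A-Z]", "", text.upper())
    d = distance
    pairs = len(letters) - d        # number of (i, i+d) positions
    triples = len(letters) - 2 * d  # number of (i, i+d, i+2d) positions
    double_hits = sum(letters[i] == letters[i + d] for i in range(max(pairs, 0)))
    triple_hits = sum(letters[i] == letters[i + d] == letters[i + 2 * d]
                      for i in range(max(triples, 0)))
    double_rate = double_hits / pairs if pairs > 0 else 0.0
    triple_rate = triple_hits / triples if triples > 0 else 0.0
    return double_rate, triple_rate


sample = "Call me Ishmael. Some years ago, never mind how long precisely"
for d in range(1, 6):
    doubles, triples = repeat_rates(sample, d)
    print(f"{d} - {100 * doubles:.3f}% - {100 * triples:.3f}%")
```

To get meaningful percentages you would obviously feed it the whole of a long text rather than one sentence.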

Indeed, what we see is that the probability of a triple letter occurring in the actual text starts very low (0.075%), but rises to close to the raw probability (from pure frequency counts) at a distance of about 5 (i.e. A….B….C….D…. etc).

So, comparing the actual triple letter count in the ciphertext with the ciphertext’s raw frequencies would seem to suggest a transposition step of about 2 is active, whereas comparing the double count in the ciphertext with the ciphertext’s raw frequencies would seem to suggest a transposition step of about 5 is active.

Yes, I know that this looks a bit paradoxical: but it is what it is. Still workin’ on it…

To add to our list of challenge ciphers (Bellaso’s, d’Agapeyeff’s, Feynman’s, etc), here’s one I hadn’t seen before from Helen Fouché Gaines’ (1956) “Cryptanalysis: A Study of Ciphers and Their Solution”, which I found courtesy of Greg Ross’s Futility Closet website:-

VQBUP PVSPG GFPNU EDOKD XHEWT IYCLK XRZAP
VUFSA WEMUX GPNIV QJMNJ JNIZY KBPNF RRHTB
WWNUQ JAJGJ FHADQ LQMFL XRGGW UGWVZ GKFBC
MPXKE KQCQQ LBODO QJVEL.

The cipher is the last in a series of exercises at the end of a chapter titled “Investigating the Unknown Cipher,” and she gives no hint as to its source. Of the exercises, she writes, “There is none in which the system may not be learned through analysis, unless perhaps the final unnumbered cryptogram.” The solution says simply “Unsolved.”

If you look at the book itself (p.217), all Gaines says is “Here is one which nobody has been able to decrypt:”. Hence it is not at all clear whether this is a composed challenge cipher (i.e. designed to confound) or an accidental challenge cipher (i.e. one found in the wild but never yet solved). I suspect the latter… but perhaps someone will know for sure either way.

Incidentally, the 1968 comment on this mentioned in the Futility Closet post is online here (it’s on p.5): just so you know, the authors there offer an [entirely fictional, I expect] “Nicodemus J. Grumbow award” for anyone solving it.

As far as the ciphertext itself goes, it has a flattish distribution (Q appears 9 times, while S, T & Y appear only twice each; all 26 letters are used), with a standard deviation of 1.52144 (taken over the letters’ percentage frequencies), i.e. much flatter than a normal alphabet would present.

It has no repeated trigrams, while QJ & PN appear three times (DO, GW, QL, GG, VQ, PV, NU, NI and XR each appear twice). There are seven doubled letters (PP, GG, JJ, RR, WW, GG, QQ), each appearing once only apart from GG, which appears twice. There are a few visible patterns in the text that vaguely suggest some kind of structuring (JAJGJ, QCQQ, QLQ and QQL), all of which might just be random.
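These counts are quick to check in Python. In this sketch the ciphertext is typed in from above, and the spread is computed as the population standard deviation of the 26 percentage frequencies, which is my assumption about how the 1.52144 figure arises (it does come out at that value when computed this way).

```python
import statistics
from collections import Counter
from string import ascii_uppercase

GAINES = ("VQBUP PVSPG GFPNU EDOKD XHEWT IYCLK XRZAP "
          "VUFSA WEMUX GPNIV QJMNJ JNIZY KBPNF RRHTB "
          "WWNUQ JAJGJ FHADQ LQMFL XRGGW UGWVZ GKFBC "
          "MPXKE KQCQQ LBODO QJVEL")

text = GAINES.replace(" ", "")
letter_counts = Counter(text)

# Spread of the 26 letter frequencies, expressed as percentages of the text.
percentages = [100 * letter_counts[c] / len(text) for c in ascii_uppercase]
spread = statistics.pstdev(percentages)

# Overlapping digraphs and trigrams that occur more than once.
digraphs = Counter(text[i:i + 2] for i in range(len(text) - 1))
trigrams = Counter(text[i:i + 3] for i in range(len(text) - 2))
repeated_digraphs = {dg: n for dg, n in digraphs.items() if n > 1}
repeated_trigrams = {tg: n for tg, n in trigrams.items() if n > 1}

# Every place where a letter is immediately followed by itself.
doubled = [text[i:i + 2] for i in range(len(text) - 1) if text[i] == text[i + 1]]

print(len(text), letter_counts.most_common(3), round(spread, 5))
print(repeated_digraphs)
print(doubled)
```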

As a result, it doesn’t appear to be a monoalphabetic substitution, nor a (conventional) polyalphabetic substitution (as there seem to be no obvious cycles, loops, or repeats). The cipher text is 125 characters long, which (as a mathematician) makes me idly wonder whether this was partly enciphered using some kind of a 5x5x5 three-dimensional transposition cipher, the sort of thing a Bond villain would gloat about in his/her evil monologue. I don’t believe for a minute that this is the case, of course, but I thought I’d mention it all the same. 🙂

Any thoughts? Is there anything that suggests to you what kind of a cipher this might be?

Just a quick note to say that I’ve been working behind the scenes for a few weeks on a revised Cipher Mysteries home page, incorporating a nice clickable list of what I think are the top unsolved cipher mysteries of all time, some of which you may not have heard of:-

  1. (–Top secret, yet to be announced–)
  2. The Voynich Manuscript
  3. The Anthon Transcript
  4. The Beale Papers
  5. The Rohonc Codex
  6. The HMAS Sydney Ciphers
  7. The Tamam Shud Cipher
  8. The D’Agapeyeff Cipher
  9. The Codex Seraphinianus
  10. The Dorabella Cipher
  11. The Phaistos Disk

Note that the HMAS Sydney Ciphers part isn’t yet live, because I haven’t written the post yet (probably later this week). 🙂  I may update the list later to insert the Vinland Map at #7, but that’s another story entirely…

Incidentally, the reason I ranked the Voynich Manuscript at #2 is because the top spot will be filled (hopefully fairly soon) with an awesome centuries-old cipher mystery I’ve been chipping away at for a while, one that will be eerily familiar to many CM readers. Don’t hold your breath, but I do think you’re going to like it a lot… 🙂

Not much to say about Tiago Rodrigues’ new d’Agapeyeff cipher site, except that he summarises the existing cipher analyses pretty well and adds a few of his own thoughts (such as splitting the search space into two 7×14 blocks to try to make it significantly more manageable). Apart from wanting him to change his outbound links from the Voynich News blogger site to the Cipher Mysteries WordPress site, I’d say he’s done a good job of putting together a pretty good starting point. 🙂

(Incidentally, my three d’Agapeyeff pages are here, here, and here.)