Hans Jahr very kindly left a comment here on Cipher Mysteries recently, pointing me at a treasure trove of declassified NSA documents relating to William Friedman.

Some of these had already been declassified by different NSA mechanisms: but even so, there’s simply so much new material to go through here that it’s enough to make your head spin.

Of course, I’ve started trying to go through these (e.g. there’s not that much new on the Voynich Manuscript, but there’s a lot on the role of Typex II in the post-WW2 years), but it’s going to take a while. Please feel free to browse and search yourselves, and let me know if you stumble onto anything interesting. 🙂

Incidentally, given that Friedman evaluated the Chaocipher in 1942, there may well be some Chaocipher-related correspondence in there somewhere for Moshe Rubin to find. (There’s a large Excel spreadsheet briefly listing out all the items).

Speaking of whom, Moshe Rubin also very kindly dropped me an email about an NSA page containing recently declassified Beale Papers documents.

As you’d expect, many have dated badly and many others are of little use, but that still leaves many other interesting things in the heap. Browse and enjoy; and if you do happen to find something about four miles from Buford’s Tavern… 🙂

More generally, though, the NSA has posted up a list of declassified topics: this includes not only Friedman and Beale, but also VENONA, the death of John F. Kennedy, UFOs, etc etc.

Yet personally, I’m more interested by this unbelievably long list of declassified code/cipher stuff, which includes items relating to just about every country that ciphered anything in the first half of the 20th century. It took me half an hour to speed read through the titles alone!

I picked out a few choice items, firstly on Double Transposition (more on that in a few days’ time):-

NR 1479 CBKH37 6021A 19401200 TRAINING PAMPHLET NO. 40 DOUBLE TRANSPOSITION CIPHERS NUMERICAL SOLUTION
NR 1871 CBLI66 4262A 19340000 GENERAL SOLUTION FOR THE DOUBLE TRANSPOSITION CIPHER
NR 2017 CBLJ73 6206A 19450212 GERMAN POW REPORTS ON DOUBLE TRANSPOSITION CIPHER SYSTEM USED BY AMERICAN
NR 4608 ZEMA36 13947A 19430200 SIGRED-2 INSTRUCTIONS FOR USING DOUBLE TRANSPOSITION CIPHER

Next, some documents on secret writing (at least one person here likes that 🙂 ):

NR 3464 CBQM37 4000A 19200000 SECRET INKS
NR 3465 CBQM37 4852A 19430109 POSTAGE STAMP CODE – ANTHONY E. SCOTTINO CASE
NR 3468 CBQM37 863A 19431129 SECRET WRITING
NR 3469 CBQM37 896A 19430326 WORKSHEETS OF THE LAB BRANCH ON SECRET INK TESTS
NR 3470 CBQM37 897A 19430601 MISCELLANEOUS PAPERS ON POW CENSORSHIP AND SECRET WRITING
NR 3471 CBQM37 898A 19440331 LETTERS CONTAINING SECRET WRITING CONDEMNED BY LAB AFTER EXAMINATION

There’s a report on the Zimmermann Telegram:-

NR 1872 CBLI66 4263A 19380000 ZIMMERMAN TELEGRAM OF JANUARY 16, 1917 AND ITS CRYPTOGRAPHIC BACKGROUND

A few pages I’ll try to get around to looking at before very long:-

NR 3552 CBSC73 6101A 19430500 SECURITY OF ALLIED CIPHERS
NR 3857 ZEMA109 46120A 19431113 UNITED KINGDOM “EXAMPLES OF OPERATIONAL EXPERIENCE IN THE USE OF PIGEONS BY THE R.A.F.”
NR 4660 ZEMA42 4410A 19420120 INTERCEPT OF CARRIER PIGEON MESSAGE

And finally, a fair few photos:-

NR 4181 ZEMA169 41048A 19170000 PHOTOGRAPH. WILLIAM F. FRIEDMAN, SIS, ASA, NSA
NR 4195 ZEMA169 41111A 19420000 PHOTOGRAPH: BRIG. JOHN H. TILTMAN, UK
NR 4693 ZEMA46 41289A 19450000 PHOTOGRAPHS: EQUIPMENT – BOMBE

I just saw a nice little online article courtesy of Clara Chow of the Straits Times, riffing on a whole load of different unreadable books.

The trigger for her article is Iterating Grace, a short-story-sized illustrated bookette stunt clearly designed to mystify idiot tech startup thought leaders, particularly those with foolishly high opinions of themselves. (Errrm… that didn’t narrow it down half as much as I hoped. But never mind.)

Of course, this being the Internet and all, you only have to blink once before more stuff gets pulled unwillingly from the shadows into the light, in this case an alleged connection between “Iterating Grace” and an artist called Curtis Schreier, who long ago was part of a stunt-liking art collective called “Ant Farm”. And so it goes on.

Chow goes on to mention various vanity books (A. M. Monius’ philosophy book, a book by “Joe K” who is probably Swede Petter Nordlund), as well as Robin Sloan’s (2013) novel “Mr. Penumbra’s 24-Hour Bookstore”, which I haven’t yet read (but sounds entertaining).

She has the Book of Soyga wrong, though: Jim Reeds famously worked out its algorithmic workings a decade ago. And as for a certain academic linguist’s I-can-read-nine-Voynich-words-all-of-them-‘meh’ self-asserted decryption… the less said the better.

Chow had fun joining all the dots together: but I don’t think she really understands that we are deep into the Freemium decade, and that blogs and social meedja posts are often just famebait, trying to dig over some virtual field-shaped community to plant a carefully crafted fame seed in. Ultimately, is “Iterating Grace” any less vain than the preening dotcom vanity it lightly satirizes? I don’t think so, but feel free to have your own opinion.

As for the poor old Voynich Manuscript, people deliberately misgrasp that at every turn to serve their own ends, to the point that their misgraspingness isn’t even funny any more. It’s hard not to conclude that His Royal Baxness and even Baron Ruggish have far less interest in the Voynich Manuscript itself than in what they think the Voynich Manuscript can do for their marvellous Middle England academic career vectors. In that respect, I think they both come across as just as shallow, despicable, meaningless, and indeed sickeningly modern as that most Freemium of casual games: Goat Evolution.

I kid you not.

The 408-symbol-long Zodiac Killer cipher (‘Z408’) was cracked by Donald and Bettye Harden in 1969 while the next 340-symbol-long Zodiac Killer cipher (‘Z340’) arrived not long after: ever since then, there has been a widespread presumption among researchers that the later cipher would just be a more complicated version of the earlier cipher (e.g. perhaps transposed in some way).

The Z340 certainly resembles Z408, insofar as the cipher shapes employed in both were very similar, and that certainly lends support to the widely-held presumption that Z340 uses the same kind of ‘pure’ homophonic cipher system. But is that the whole story? Personally, I’m not so sure…

Unusual Aspects of the Z340

It has long been pointed out that the Z340 cipher sports a number of idiosyncratic features that are not present in the earlier Z408 cipher. For example, the FBI’s Dan Olson pointed out a few years ago that:

* Statistical tests indicate a higher level of randomness by row, than by column. This indicates that the cipher is written horizontally and rules out any transposition patterns that are not strictly horizontal.

* Lines 1-3 and 11-13 contain a distinct higher level of randomness than lines 4-6 and 14-16. This appears to be intentional and indicates that lines 1-3 and 11-13 contain valid ciphertext whereas lines 4-6 and 14-16 may be fake.

* Because of the vertical symmetry of the statistical observations, the message may have been written, then split into two equal size parts and placed top over bottom.

These suggest that something odd might be going on inside the cipher, though: in this respect, the Z340 cipher resembles the Voynich Manuscript’s frustrating ‘Voynichese’, which looks straightforward on the surface but which turns out to have many behavioural features which are not seen in other known ciphers.

I’d also add that row #10 starts and ends with ‘-‘, which looks somewhat artificial – though it could just be random, it may also have some kind of meta-significance for the interpretation of the overall cryptogram (e.g. “CUT HERE”).

Finally, I’d add that Z340’s final (20th) line looks very much as if it contains a mangled ZODIAK signature, which – if correct – would probably make sense as 50% crypto padding, and 50% flipping the bird at the FBI. 😉

Anyway, given that the message contains 20 lines of 17 symbols (20 x 17 = 340) and we can see similar artefacts in rows 1-3 and 11-13, then it seems likely to me that there was some kind of major coding break after row #10.

Consequently, I’ve long wondered whether the two halves of Z340 (let’s call them ‘Z170-A’ and ‘Z170-B’) used a different set of cipher-symbol-to-plaintext-letter assignments to each other: in which case, the sensible way to make progress would be to try to solve each half separately. Even so, we would still need to eke out some additional assistance (or meta-assistance) from the texts to make progress, because the odds are so heavily stacked against us.

Yet there’s another feature of the Z340 cipher which struck me a while back but which I haven’t got round to blogging about until now. It’s all to do with doubled shapes, and the story starts with the Z408 ciphertext…

Z408’s doubled letters

To construct Z408, the Zodiac Killer used 7 shapes for { E }, 4 shapes each for { T, A, O, I, N, S }, 3 shapes each for { R, L }, 2 shapes each for { D, F, H }, and 1 shape each for the rest (probably): this yields a grand total of 54-ish cipher shapes to encipher 26 plaintext letters.

Given that the instance count curve for the English alphabet is often described as “ETAOINSHRDLU…”, this tiered arrangement makes sense (as I recall, various researchers have tried to use the homophone allocation to infer which popular cryptography manual the Zodiac Killer specifically relied upon, but I don’t remember if there was a definitive answer to that question).

However, one particular letter caused him a lot of practical problems for Z408: the letter L. Even though this has a relatively small frequency count (compared to, say, the letter E), the particular text he enciphered included numerous ‘LL’ pairs. That is kind of what you get if you want to say the word ‘KILL’ all the time: the words with a double-L are KILLING, KILLING, ALL, KILL, THRILLING, WILL, ALL, KILLED, WILL, WILL, WILL, and COLLECTING.

(As an aside, I’ve often wondered whether the multiple repetitions of the word “WILL” might possibly imply that the Zodiac Killer’s first name was indeed “WILL” / William. The subconscious is a funny ‘wild animal’ in that way.)

Anyway, as a direct result of this, the letter L is used here more often than its normal English stats would suggest: and so the Zodiac Killer had to encipher ‘LL’ 12 times with only three shapes in its tier. To avoid pattern repetitions, he ended up doubling up the enciphered L-shapes a few times, and so the final Z408 ciphertext included a number of doubled L shapes.

The only other doubled letter was ‘G’, which only had a single shape allocated to it, and which appeared doubled only once.

Z340’s Problem With ‘+’

If Z340 (which uses 63 distinct shapes) uses a similar kind of homophonic cipher to the one used in the Z408 cipher (which uses 54 distinct shapes), then I would say it has a very specific problem with whatever is being enciphered by the shape ‘+’.

‘+’ occurs 24 times (7% of the total number of characters, and exactly double that of ‘B’, the second most frequent shape), which by itself largely makes a nonsense of the suggestion that Z340 is a homophonic cipher: anything with that high a frequency count should surely have a whole set of homophones to represent it.

You might wonder whether ‘+’ enciphers a frequent word or syllable, such as ‘THE’ or ‘ING’. However, it appears three times immediately doubled with itself, i.e. ‘++’ (the only other letter that appears doubled is the sequence ‘pp’ that occurs once near the start of row #4).
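For anyone wanting to check these counts against their own transcription, here is a minimal sketch of the tally. It assumes the ciphertext is held as a list of row strings with one character per shape; the `toy_rows` below are a made-up placeholder, not the real Z340 transcription, and `symbol_stats` is just an illustrative name.

```python
from collections import Counter

def symbol_stats(rows):
    """Return (frequency counter, list of doubled symbols) for a ciphertext
    given as a list of row strings, one character per cipher shape."""
    # Read the grid left-to-right, top-to-bottom as one continuous stream.
    # Note: this also counts a pair that straddles a row boundary.
    stream = "".join(rows)
    freq = Counter(stream)
    doubled = [a for a, b in zip(stream, stream[1:]) if a == b]
    return freq, doubled

# Toy placeholder rows, NOT the real Z340 transcription.
toy_rows = ["AB++C", "DppE+", "F+GHI"]
freq, doubled = symbol_stats(toy_rows)

# Share of the whole text taken up by '+'
plus_share = freq["+"] / sum(freq.values())

# The figure quoted above for the real cipher: 24 out of 340 symbols
z340_plus_share = 24 / 340
```

With a real 20 x 17 transcription dropped in for `toy_rows`, `freq.most_common()` gives the ranking of shapes and `doubled` lists every immediately repeated shape.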

Even if, as Dave Oranchak did, you do a brute force search for homophone cycles (don’t get me started on what they are, or we’ll be here all night), you don’t find anything that accounts for Z340’s ‘+’ shape.

And yet, as Dave Oranchak points out, Z340 has some strong-looking homophone cycles, such as [l*M] [l*M] [l*M] lM [l*M] [l*M] [l*M], which would seem to imply that Z340 is at heart a homophonic cipher. There are plenty of other measures (many noted by my late friend Glen Claston) that point in the same direction.

Moreover, because the number of shapes used is greater than for the Z408 cipher, you would naturally expect to see more tiers or wider tiers (though 7 shapes for E was already quite a wide tier). So you would naturally expect to see a consequent flattening of the statistics. And yet ‘+’ bucks that trend completely.

How Can We Reconcile These Two?

As a starting point, you might note that ‘M+’ occurs three times in the top half, but not at all in the bottom half. In fact, M is always followed by + in the top half, and never followed by + in the bottom half (where it occurs four times).
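The top-half versus bottom-half comparison is easy to reproduce. The sketch below assumes the same row-string representation (one character per shape) and simply splits the rows at the midpoint; `toy_rows` is invented for illustration and is not the Z340 text, and the function names are hypothetical.

```python
def followed_by(rows, first, second):
    """Return (times `first` occurs, times it is immediately followed by
    `second`) within a block of rows read as one continuous stream."""
    stream = "".join(rows)
    total = stream.count(first)
    followed = sum(1 for a, b in zip(stream, stream[1:])
                   if a == first and b == second)
    return total, followed

def compare_halves(rows, first, second):
    """Run the same count separately on the top and bottom halves."""
    mid = len(rows) // 2
    return (followed_by(rows[:mid], first, second),
            followed_by(rows[mid:], first, second))

# Toy placeholder rows, NOT the real Z340 transcription.
toy_rows = ["M+AB", "CM+D", "MEFG", "HMIJ"]
top, bottom = compare_halves(toy_rows, "M", "+")
```

For a 20-row transcription the split falls after row #10, which is the same break between 'Z170-A' and 'Z170-B' discussed earlier.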

It seems to me that the ‘+’ shape makes the top half (Z170-A) easy and the bottom half (Z170-B) difficult all at the same time. And that’s not something that I personally can comfortably reconcile with the kind of one-size-fits-all pure homophonic solutions most people seem to be looking for, even with confounding transposition stages thrown in: the behaviour of the ‘M+’ pattern would seem to point away from almost all of the transposition variants previously proposed.

Having really, really thought about it, my tentative conclusion is that ‘+’ seems to operate more as a kind of meta-token rather than as a pure token. I mean this in the same general way that certain Voynichese letters seem to me to encipher 15th century shorthand tokens (‘contractio’, etc).

A Suggestion

As I recall, the Hardens found their crib by guessing that the first letter was “I” and then looking for the word “KILL”: the ease of which doubtless made an already angry psychopath even angrier than he already was.

Hence to my mind, the thing he would most likely have been looking to solve when moving from his Z408 cipher system to his Z340 cipher system was how to make that new system impervious to that specific kind of an attack. And the key letter that let him down first time round was the letter ‘L’, specifically in its doubled form.

Consequently, I propose that this was the single technical challenge that spurred the internal changes from Z408 to Z340. And there was one obvious – but admittedly very old-fashioned – trick that he could have used to make doubled letters harder to see.

So here’s my suggestion. Could it be that the Zodiac Killer used ‘+’ as a meta-token to mean “REPEAT THE LAST LETTER“? (‘++’ would then mean a tripled letter, or perhaps something else entirely).

If that’s correct, I would further expect that ‘M’ was one of the homophones for ‘L’, and the [l*M] cycle could very well have been the 3-long homophone loop for ‘L’.
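To make the "repeat the last letter" suggestion concrete, here is a small sketch of how such a meta-token would be expanded once the other shapes had been turned back into letters. It is purely illustrative: the function name is made up, it treats '++' as a tripled letter (only one of the readings floated above), and it says nothing about whether Z340 actually works this way.

```python
def expand_repeat_token(symbols, token="+"):
    """Expand a hypothetical 'repeat the last letter' meta-token.

    `symbols` is a partially decoded stream in which every shape except
    the meta-token has already been replaced by its plaintext letter.
    """
    out = []
    for s in symbols:
        if s == token:
            if out:                 # repeat whatever letter came last
                out.append(out[-1])
            # a token with nothing before it has nothing to repeat: skip it
        else:
            out.append(s)
    return "".join(out)

example = expand_repeat_token("KIL+ING")   # a doubled L hidden behind '+'
tripled = expand_repeat_token("AL++")      # '++' read as a tripled letter
```

Under this reading a doubled letter never shows up as a doubled shape in the ciphertext, which is exactly the property a crib on "KILL" relies on.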

Do you really believe that the Zodiac could write a taunting message to the police without using the four-letter sequence “KILL”? In many ways, that was kind of the whole point.

I mentioned in a previous post that I thought that the Scorpion S5 cipher’s numerous shape families might offer a backdoor into its cipher system, if they just happened to be elegantly arranged on downward diagonals. I pointed out that if this were correct, the “dice” shape family that appears in columns 1, 3, 4, 8 (twice), 9, 12, 14, 15 would be most likely to have been arranged such that A was 1, C was 3, D was 4 (and so forth).

However, I didn’t actually get so far as calculating the precise probabilities in that post: but now I have (I think).

In my Scorpion spreadsheet, the total probability that a specific family was enciphered as a specific sequential set of letters is calculated as the product of each individual letter’s likelihood. By ‘likelihood’ here, I mean not the probability of that letter occurring randomly (i.e. P, its raw instance probability), but the chances of that occurring exactly N times within a column of letters of height H. And in Excel, you calculate this function using the in-built function ‘BINOMDIST(N, H, P, false)‘. (Note that instead using ‘BINOMDIST(N, H, P, true)’ would calculate the cumulative likelihood of that happening, i.e. the chances of that probability P event happening 0 times up to N times out of a maximum of H times.)

For the raw instance probability values, I used the Scorpion encipherer’s plaintext as a reasonable approximation of the text we are likely to find encrypted inside the S5 cipher. I think there’s a pretty good chance that it will be good enough.

As for the height H: once you have rearranged the message according to the 16 apparent columns of the ciphertext, columns 1 to 4 contain 12 instances each, while for columns 5 to 16, each one contains 11 instances. All of which means that the binomial probability table for N out of 11 looks like this:

[Image: binomial probability table for N out of 11]

For example, even though the raw instance probability for ‘E’ is 11.35%, the chance that a given 11-high column of letters will contain exactly one ‘E’ is 37.4271% (or so my spreadsheet says, anyway).
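If you would rather check this outside Excel, the same quantity is the ordinary binomial probability mass function. The sketch below is a plain-Python stand-in for BINOMDIST(N, H, P, false) and BINOMDIST(N, H, P, true); the function names are my own, and the 11.35% figure for 'E' is the rounded value quoted above, so the result should land close to (not exactly on) the spreadsheet's 37.4271%.

```python
from math import comb

def binom_pmf(n, h, p):
    """Chance of exactly n hits in h independent trials of probability p.
    Equivalent in intent to Excel's BINOMDIST(n, h, p, FALSE)."""
    return comb(h, n) * p**n * (1 - p)**(h - n)

def binom_cdf(n, h, p):
    """Chance of 0 up to n hits out of h trials.
    Equivalent in intent to Excel's BINOMDIST(n, h, p, TRUE)."""
    return sum(binom_pmf(k, h, p) for k in range(n + 1))

p_e = 0.1135                          # raw instance probability for 'E', as quoted
one_e_in_11 = binom_pmf(1, 11, p_e)   # exactly one 'E' in an 11-high column
print(f"{one_e_in_11:.4%}")
```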

But rather than limit the calculation only to length-16 families, I added a trick whereby shorter families can be checked against other diagonals in the cipher table. If you use the number 99 as the count for an individual family’s column, the spreadsheet works around it in the calculation, by allowing the shifted alphabet to start not at ‘A’ but at ‘z’ (i.e. ‘A – 1’).

I’ve included 11 shape families from the S5 cipher: if you copy a row from any one of these across to row #33, the spreadsheet will calculate a composite ranking value for each of 28 different offsets in column U (the ‘Result’ column). This is equal to the final probability times a million (or else the numbers would be too small to be practical).

For example, the relative rankings for the dice family are:-

2.737265 A
0.013655 B
0.000000 C
0.000046 D
0.293415 E
0.018483 F
0.093272 G
0.000451 H
0.000078 I
0.000000 J
0.009360 K
0.074230 L

Here, the ranking for ‘A’ (2.737265) is nearly 10x the ranking for second placed ‘E’ (0.293415), which is essentially what my initial imprecise guess was (thank goodness). 🙂
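For readers without Excel, here is a rough sketch of the same kind of ranking calculation. Several things in it are assumptions rather than a copy of the spreadsheet: the letter frequencies are approximate general English values, not the Scorpion-plaintext figures the spreadsheet uses, so it will not reproduce the numbers above; the product is taken only over the columns you list; and offsets that would run past 'Z' simply score zero rather than using the spreadsheet's '99' workaround. All the names are hypothetical.

```python
from math import comb
import string

# ASSUMPTION: rough general-English letter frequencies (percent), standing in
# for the Scorpion-plaintext frequencies used in the actual spreadsheet.
ENGLISH_FREQ = {
    "A": 8.2, "B": 1.5, "C": 2.8, "D": 4.3, "E": 12.7, "F": 2.2, "G": 2.0,
    "H": 6.1, "I": 7.0, "J": 0.15, "K": 0.8, "L": 4.0, "M": 2.4, "N": 6.7,
    "O": 7.5, "P": 1.9, "Q": 0.1, "R": 6.0, "S": 6.3, "T": 9.1, "U": 2.8,
    "V": 1.0, "W": 2.4, "X": 0.15, "Y": 2.0, "Z": 0.07,
}

def binom_pmf(n, h, p):
    """Chance of exactly n hits in h trials of probability p."""
    return comb(h, n) * p**n * (1 - p)**(h - n)

def column_height(col):
    """Columns 1-4 hold 12 symbols, columns 5-16 hold 11 (as described above)."""
    return 12 if col <= 4 else 11

def family_ranking(counts, offset, freq=ENGLISH_FREQ):
    """Product of per-column binomial likelihoods, times a million.

    counts: {column number (1-16): times the family appears in that column}
    offset: 0 means the column-1 member is 'A', 1 means 'B', and so on.
    """
    score = 1.0
    for col, n in counts.items():
        idx = offset + col - 1
        if idx > 25:                  # no wraparound past 'Z' in this sketch
            return 0.0
        p = freq[string.ascii_uppercase[idx]] / 100.0
        score *= binom_pmf(n, column_height(col), p)
    return score * 1_000_000

# Dice family: columns 1, 3, 4, 8 (twice), 9, 12, 14, 15
DICE_COUNTS = {1: 1, 3: 1, 4: 1, 8: 2, 9: 1, 12: 1, 14: 1, 15: 1}
rankings = {string.ascii_uppercase[o]: family_ranking(DICE_COUNTS, o)
            for o in range(12)}
```

Swapping in the real plaintext frequencies for `ENGLISH_FREQ` is the obvious next step if you want figures comparable to the 'Result' column.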

It’ll take a while to figure out what this all means, but I thought I’d post the basic spreadsheet sooner rather than later. 🙂

This is a story about three men, two of them alive and the other long dead: and, as Steve Martin famously said at the start of L.A. Story (1991), “I swear, it’s all true“…

The Somerton Man

Mysteriously, our first protagonist was found dead on Somerton Beach near Adelaide on 1st December 1948: his identity, despite the passing of several decades since, has still not been determined. Yet it recently turned out [*] that this ‘Somerton Man’ was known by at least one person – a nurse who once signed herself “Jestyn”, but whose real name was Jessica Thomson (née Harkness), and whose Adelaide phone number was written on the back page of a book later connected to the man, though she never disclosed his identity to anyone (if indeed she ever knew it).

As far as evidence goes, the cold case associated with this man has heaps of (for want of a better word) “micro-clues”: and we really should be able, with all our modern databases, computers, and crowdsourced collaborationware, to identify him without much difficulty. Yet apart from the fact that he was a fit-looking guy not much older than forty with an enlarged spleen, we don’t know (a) who he was; (b) where he was coming from; (c) where he was going to; (d) what he was doing; or even (e) what killed him, let alone anything so fancy as (f) why.

All of which is defensive researcher-speak for we know diddly-squat of importance about him: the truth is we haven’t even got started.

As a result of all this, what can only be termed wretchedly hopeful theories – was he romantically connected with the nurse? was he an American spy? a Soviet spy? a uranium prospector? a car thief? a black marketeer? a Third Officer on a merchant ship? etc etc – hover over his long-dead corpse like flies above dung.

But the thing he now most resembles is a blank Sudoku grid – a puzzle which has at least as many answers as people scratching their heads over it. Why not insert your own pet theory (or indeed theories) into his still-basically-blank grid? Some days it seems as though every other bugger has: welcome to the world of the Somerton Man. 🙂

Derek Abbott

Professor Derek Abbott is our second main protagonist. A few days ago, a long-form piece in the California Sunday Magazine laid out his personal journey from obsessive London schoolboy to Professor of Electrical & Electronic Engineering at the University of Adelaide.

But most importantly, the piece finishes up with something that has been an open secret within the Somerton research community (as if anything so ramshackle and disparate can have so grand a title): that a few years ago Abbott married Rachel Egan, by whom he has three young children. Oh, and if you didn’t already know, Egan’s grandfather was Robin Thomson, the nurse’s son: which certainly directly links Abbott to the mystery of the Somerton Man, and quite possibly to the dead man himself.

Unfortunately, Abbott has devised a whole host of strategies to work around his well-trained stance of scientific impartiality, because he has become utterly convinced that the Somerton Man was Robin Thomson’s real father, despite having (as far as I can see) no proof of this whatsoever beyond really wanting it to be true. And so, over the last few years, Abbott has conjured up all manner of petition-backed legal motions to exhume the Somerton Man (essentially, a techy ‘fishing trip’ to extract DNA from the dead man’s teeth or bones), every one of which has been rejected.

Abbott’s latest variant on this theme – to convince American crowdfunders to back his group’s ongoing research via a £100,000 Indiegogo campaign – currently seems fairly dead in the water (having raised roughly £227 after 18 days, i.e. less than 0.25%), despite his efforts to promote it to gullible open-minded American backers, even floating the possibility of some long-winded family connection between the Somerton Man (or, to be precise, between Robin Thomson who he believes to have been the Somerton Man’s son) and Thomas Jefferson’s family.

For me, the two biggest problems with Abbott’s Indiegogo campaign are (a) that it doesn’t actually specify where the money would go, just that it would be spent on a range of things Abbott believes would best achieve the goal of identifying the Somerton Man, even though he only really has a single theory in play that he wishes to try to prove; and (b) that, given that he plans to put a fair tranche of this Phase 1 cash on building videos and lobbying to promote a putative “Phase 2” (raising even more cash and doing even more complicated tests), he hasn’t exactly been open about this.

Actually, it turns out that crowdfunders are far less gullible and, frankly, far cleverer than Abbott seems to believe them to be. They like proper details on a project page (ones they can actually check for themselves); they like plans that are specific, believable and actionable; and they like to back people who are taking on difficult things that benefit everybody, not just themselves. Abbott clearly believes that he has ticked all of these boxes: I don’t think he has.

Of course, it’s down to individual crowdfunders where they put their money, and Abbott might yet stumble into a nest of random accidental energy billionaires who end up throwing a wodge of Monopoly oligarch money in his direction. All I can say is that as far as codes and ciphers go (this is, after all, Cipher Mysteries), all Abbott and his students have managed to do in eight years is essentially what Aussie super-codebreaker Eric Nave did in one day in 1949 (and without computers to help him). Hence I wouldn’t expect them to make any progress with the specifically cipher mystery side of this story any time soon.

Feltus

The California Sunday magazine piece also lays out Abbott’s bitter ongoing rivalry with former South Australian detective Gerry Feltus. Feltus, who retired back in 2004, considers Abbott a pest, and – I’m sure it’s there between the lines somewhere, but please correct me if I’m wrong – an annoying prick with it. Furthermore, though Gerry has never said such a thing to me, I’d be unsurprised if the phrase completion “…and Costello” looms large in his mind whenever he hears the Professor’s surname. Let’s face it, the Aussies really are masters of sledging, so Abbott’s surely bound to come out wet in any pissing contest.

The key difference between these two men’s approaches is plain to see. While Abbott knows exactly what family history he wants to prove and is willing to spend £100K of other people’s money (in Phase 1, and probably double that in the Phase 2 lined up in his mind) to do it, Gerry Feltus is the opposite: patient, meticulous, careful, and seemingly immune to theories. He thrives on the fuzz of doubt: and what he says and writes is all the better for it.

You also don’t have to look very deeply to contrast Abbott’s attempts to embrace the wonders of crowdfunding and Internet self-promotion with Feltus’s dislike for the Internet’s noisy troll-yappery. In many ways, Feltus’ book The Unknown Man is the epitome of doubt, care and patience: the two men may be united by the Somerton Man, but in every other aspect they really are chalk and cheese.

Yet in a way, this kind of starkly opposite pair of trenches isn’t a helpful part of their discourse: in my opinion, pure credulity and pure doubt are both inadequate methodologies for tackling something as historically complex as the Somerton Man.

And so it is for me that even though Abbott often comes across as though he is a scientist doing bad history, Feltus is still thinking too much like a detective, and not enough like an historian – and there’s a big difference.

For sure, Feltus’s overall approach is hugely better than Abbott’s: but – in my opinion – what differentiates the best historians is a driven willingness to choose just the right kind of a limb to go out on to help them find the key evidence they need, and I’m not sure Gerry – who I like, if you hadn’t worked that out by now – has yet developed that ability. (Abbott thinks he has, but he plainly hasn’t.)

The Lessons Of History

Oddly, the cipher mystery world has seen something similar to all this before, insofar as Abbott is trying to raise funding for what constitutes a full-frontal attack on the Somerton Man mystery. Arguably the closest parallel is Colonel Fabyan’s Riverbank Labs from a century ago, that famously brought William Friedman and Elizebeth Friedman together. Yet the central point of what Fabyan was doing was to try to prove something that he firmly believed was an a priori truth: that the real genius behind all William Shakespeare’s fine words was none other than Francis Bacon.

Despite the fact that the whole exercise yielded good incidental results (though I would expect that the Friedmans would have met and perhaps even married through Governmental crypto channels anyway), Fabyan’s attempt to prove Bacon’s authorship was still a foolish thing to be trying to do.

Perhaps Abbott’s efforts will incidentally / accidentally yield secondary long-term benefits: it’s always possible. But it doesn’t mean that I don’t think he’s ultimately doing just as foolish-minded a thing as Fabyan was doing, back a century ago.

[*According to her family in a recent TV documentary*]

Any talk that starts with “I hope you’re all here because you like history and you like numbers, because we’re going to do a lot of math later on…” has a vast amount to commend it already (in my books, at least).

And Anja Drephal’s talk on pen-and-paper crypto hacking from Stalinist-era Russia just keeps on getting better (she was doing her doctorate on metahistory in Vienna at the time of the 30th Chaos Communication Congress in Hamburg in December 2013, where she made this presentation).

Richard Sorge, the subject of the talk, belonged to the same pre-WW2 Soviet ‘out-reach’ spy wave that carried ‘Otto’ to London (who recruited Kim Philby, etc) and so forth. To get messages back to Vladivostok from Japan, the Russian military devised a pen-and-paper additive book cipher for him – he chose a German statistical yearbook for 1935 as his book, and away they went.

Sorge and his group were captured in 1941, and put in jail: he was sentenced to death (mostly in the hope that the Russians would swap him for a Japanese spy in a Russian jail), but because the Soviets denied all knowledge of him, he was executed before the end of the war.

Most of the Soviet and GDR historiography about Sorge later painted a heroic picture of him: while most of the West German historiography focuses instead on his drunkenness, his numerous affairs, his illnesses and so forth. As always, the truth lies somewhere in the middle: but perhaps that’s the nature of the spy ‘trade’, to fall into every historical crack.

Anyone hoping to find insights into other well-known unbroken ciphers (I’m thinking in particular of the Somerton Man’s Rubaiyat cryptogram) will doubtless come away dissatisfied: but it’s not really that kind of a thing. I liked it anyway. 🙂

I’ve been thinking a little more about how to go about cracking Scorpion Cipher S5.

I mentioned before that I thought that the encipherer might well have started from an elegant-looking 26×16 grid filled with diagonally-downward families of shapes, and that this arrangement might offer codebreakers some additional kind of “spatial logic” to support their efforts that traditional ciphers don’t usually provide.

From the letters that accompanied the ciphertexts, my inference is that the Scorpion is like a smart 12-year-old who has just ‘got’ the elegance of maths: but this leads me to a secondary inference that he/she probably didn’t understand modulo addition, because if he/she did, then we would surely have seen more 16-element shape families in the text.

I’ll explain with the help of a diagram of the kind of 26×16 grid I’m talking about:

[Image: Scorpion cipher 26×16 grid diagram]

If the encipherer had laid out his/her grid with modulo-26 maths in mind, then 16-element families that start in the orange (top right) area and step diagonally down and to the right (as I predict) should wrap around (modulo 26) to the yellow (bottom left) area. However, I believe that we don’t see nearly enough length-16 shape families to support that grid-filling model.

What I think actually happened was that the encipherer only started length-16 families in the A-K range for alphabet #1, which would have ended on P-Z for alphabet #16. This means, for example, that because the ‘dice’ family (actually, the ‘dots in a square’ family, to be precise) has members in alphabets 1, 3, 4, 8, 9, 12, 14, and 15, we may well be able to directly infer that its very first member (in alphabet #1) is A-L.
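The no-wraparound constraint boils down to simple arithmetic: a family whose last member sits in alphabet #k steps k - 1 places down the alphabet, so its first member can be no later than letter 26 - (k - 1). A minimal sketch of that calculation, with a made-up function name and the assumption that each alphabet shifts the letter by exactly one place:

```python
import string

def unwrapped_starts(last_alphabet):
    """Letters a family's alphabet-#1 member could stand for if the family
    runs diagonally down to `last_alphabet` without wrapping past 'Z'.
    ASSUMPTION: each successive alphabet shifts the letter by one place."""
    steps = last_alphabet - 1
    return string.ascii_uppercase[:26 - steps]

full_length = unwrapped_starts(16)   # a length-16 family, ending in alphabet #16
dice_family = unwrapped_starts(15)   # the dice family's last member is in alphabet #15
```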

Moreover, given that the lowest frequency letters in the encipherer’s accompanying letters are…

k : 0.4%
x : 0.3%
j : 0.1%
z : 0.0%
q : 0.0%

…we may also be able to make a reasonable guess as to which possibilities of A-L are the least likely. For example, because the dice family appears in columns 1/3/4/8/9/12/14/15 (of the 16-column sequence I discussed before), this would map to:

+0 : ACDHILNO --- OK
+1 : BDEIJMOP --- has J, so fairly unlikely
+2 : CEFJKNPQ --- has J, K and Q, so not likely at all
+3 : DFGKLOQR --- has K and Q, so not likely
+4 : EGHLMPRS --- OK
+5 : FHIMNQST --- has Q, so not likely
+6 : GIJNORTU --- has J, so fairly unlikely
+7 : HJKOPSUV --- has J and K, so not likely
+8 : IKLPQTVW --- has K and Q, so not likely
+9 : JLMQRUWX --- has J, Q and X, so not likely at all
+10: KMNRSVXY --- has K and X, so not likely
+11: LNOSTWYZ --- has Z, so not likely

So in fact, I suspect that we already know enough to guess that the dice family members encipher either ACDHILNO or EGHLMPRS (in sequence), which I think isn’t a bad starting point at all.
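The table above is mechanical enough to regenerate in a few lines, which makes it easy to repeat for other shape families. This sketch assumes the same one-letter-per-column diagonal shift, and treats the five lowest-frequency letters listed above (k, x, j, z, q) as the "unlikely" set; the names `COLUMNS`, `RARE` and `shifted_family` are just for illustration.

```python
import string

COLUMNS = [1, 3, 4, 8, 9, 12, 14, 15]   # where the dice family appears
RARE = set("KXJZQ")                     # lowest-frequency letters listed above

def shifted_family(offset, columns=COLUMNS):
    """Letters the family would encipher if its column-1 member were
    'A' + offset and each column stepped one letter further on."""
    return "".join(string.ascii_uppercase[offset + c - 1] for c in columns)

for offset in range(12):                # +0 (A) up to +11 (L)
    letters = shifted_family(offset)
    rare = sorted(set(letters) & RARE)
    note = "has " + ", ".join(rare) if rare else "OK"
    print(f"+{offset:<2}: {letters} --- {note}")

# Offsets whose letters avoid every rare letter
candidates = [shifted_family(o) for o in range(12)
              if not set(shifted_family(o)) & RARE]
```

To try another family, change `COLUMNS` to its column list and shrink the `range(12)` so that the last column never steps past 'Z'.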

Finally, I suspect there’s something of a cryptological paradox in play here: the more alphabets are involved, the more spatial structure we have to work with. Hence S5’s 16 alphabets might well make it surprisingly crackable. 🙂

A few weeks ago, an occasional email correspondent proposed in some depth that the Beale Ciphers were some kind of Masonic cipher, as Joe Nickell had famously claimed many years earlier.

One of the grounds my correspondent cited was that Robert Morris’s (~1860) “Written Mnemonics” employed what he (though not a cryptologist himself) thought was surely a largely similar dictionary cipher: and so it was no great stretch at all to see the Beale Ciphers as a Masonic cipher too, right?

I’d seen “Written Mnemonics” mentioned in a number of places (most notably in Klaus Schmeh’s online list of encrypted books), but had never seen it up close and personal, even though it was quite a well-known historical cryptogram. So I bought a copy to see it properly for myself. And, as Barry Norman was (and probably still occasionally is?) wont to say, why not?

written-mnemonics-cover

Maybe one day I’ll also get round to buying myself a copy of the Oddfellows cryptogram booklet I cracked too. But my cipher book-buying account is none too flush right now, having just bought four Beale-related books this month. 🙂

Anyway, I posted a permanent webpage here for “Written Mnemonics” with some scans of its first few pages: but it seems highly unlikely to me that anyone would be able to crack it without the (separately published) cipher key document, of which I don’t currently have a copy. (Of course, if anyone happens to know how I can get a copy of that, please let me know!)

The historical background is that the book’s author, Robert Morris (no relation to the “Robert Morriss” mentioned in the Beale Papers, sorry if that’s inconvenient), produced these “Written Mnemonics” to try to preserve and distribute what he believed (from his own historical research) to be the oldest genuine forms of Masonic rites. Though this went against the letter of Masonic practice, he and a group of like-minded people known as the “Masonic Conservators” felt that the historical urge to conserve these rituals in written (albeit strongly enciphered) form outweighed the letter of the Law that said not to record them.

However, this was a controversial thing for him to do because when you signed up to be a Mason, you specifically swore never to write Masonic rituals down – they were supposed to be passed down orally, as part of an (allegedly) millennia-spanning tradition of doing so (though whether this supposition is actually true or not is another matter entirely).

And so Morris’ publication in the 1860s of a 3000-copy print run of his “Written Mnemonics” book proved problematic for many Masons, particularly those of a more conservative disposition (of which there were more than a few). Unfortunately, there wasn’t really a middle ground to be had in the ensuing debate: and ultimately Morris came off the worse of most of the associated arguments, and so ended up being pushed to the movement’s periphery, if not the cold outside.

History hasn’t really remembered Morris well, which is perhaps a little unfair: though this may partly be because Ray Vaughn Denslow’s (1931) book The Masonic Conservators covered the ground of what happened so well that there was little else of great interest for later historians to scratch through.

sons-of-the-desert

Might the Beale Ciphers be Masonic? Well, it’s entirely true that a fair few men of that era were Masons or Oddfellows or Sons of the Desert (or whatever), and so there was a reasonable statistical chance that the person who enciphered the Beale Ciphers was at least coincidentally a Mason: hence I can’t currently prove that the Beale Ciphers were not some kind of smartypants Masonic cipher of a previously unknown form.

But having gone over Denslow’s descriptions of Morris’s cipher key (which Denslow clearly had seen one or more copies of), I can say that there is clearly no connection whatsoever between the kind of code used by Morris and the kind of dictionary cipher used in B2, or indeed the (very probably) hybridized dictionary cipher used in B1 and B3.

So might the Beale Ciphers have anything at all to do with Morris’ “Written Mnemonics”? From what I can see so far, the answer is an emphatic no, sorry. As always, please feel free to point me towards other documents or evidence that suggests otherwise. 🙂

On p.114 of Jerrold Northrop Moore’s weighty “Edward Elgar: A Creative Life”, the author notes that Elgar’s enciphered “Liszt fragment” had been decoded (in 1977, according to Anthony Thorley whose decryption it was) to read:

Gets you to joy, and hysterious

Well… it’s certainly a claim, even if ‘hysterious’ is a made-up word found nowhere else. And one of the (cryptologically, at least) interesting aspects that link this Liszt fragment and Elgar’s Dorabella Cipher is that while both of them seem unlikely to have employed complicated cipher systems, for all of that both also seem improbably hard nuts to crack. You’d certainly need a sweet nutcracker to achieve it. [*]

I’ve discussed Elgar’s Liszt fragment before, written in the left margin of an 1885/1886 Crystal Palace Saturday Concert Programme:

liszt-fragment

The cipher on its own looks like this (sorry, but I don’t have a better scan):-

liszt-fragment-solo

It’s not a great scan, certainly: but given that the dash looks as though it is meant to sit at the end, and that there are several half-space-sized gaps, it looks as though we might be able to transliterate this as:

ABC DECFGB HID CBJKDK

What should be immediately apparent is that there is no obvious way to convert this 3 + 6 + 3 + 6 = 18 letter cryptogram into Thorley’s 25-letter “Gets you to joy, and hysterious”, without a singularly large floor space for mental acrobatics to bounce around on. (If that’s what you want to do, feel free to go ahead.)

And yet, what we undeniably have with the Liszt fragment that we don’t seem to have with the (much later) Dorabella Cipher is context, specifically a musical context. And here I can’t help but notice not only that the Liszt ciphertext seems to have been written in sets of three or sets of six, but also that the music it sits beside also has a very strong emphasis on triplets, groups of three notes.

Moreover, the 18-letter group is written immediately beside an 18-note line of music, “No. 6 Allegretto Pastorale”. Might the first be enciphering the other in some way?

a...b...c...d...e...c...f...g...b...h...i...d...c...b...j...k...d...k...
B...G#..E...B...G#..E...B...B...E...F#..G#..C#..B...G#..F#..E...F#..G#..

I can’t see any obvious cryptographic connection myself here, but I was somewhat surprised to find that nobody had apparently suggested this at least as a reasonable possibility for the Liszt fragment, far more so than for the Dorabella Cipher. (Plenty of people [e.g. Javier Atance, etc etc] have suggested that the Dorabella Cipher might be enciphering music, but that’s another story entirely).
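One quick test that can be run on the pairing above is whether each cipher shape always lines up with the same note, which is what a simple shape-for-pitch substitution would require. Here is a small Python sketch of that check; it assumes both my tentative transliteration and my reading of the 18 notes are right, and the helper name `images` is purely illustrative.

```python
from collections import defaultdict

# Tentative transliteration of the Liszt fragment (spaces removed)
CIPHER = "abcdecfgbhidcbjkdk"

# The 18 notes of the adjacent line of music, as listed above
NOTES = "B G# E B G# E B B E F# G# C# B G# F# E F# G#".split()


def images(keys, values):
    """Map each key to the set of values it is paired with, position by position."""
    seen = defaultdict(set)
    for key, value in zip(keys, values):
        seen[key].add(value)
    return dict(seen)


symbol_to_notes = images(CIPHER, NOTES)

# Shapes that line up with more than one pitch cannot be a simple substitution
conflicts = {s: sorted(n) for s, n in symbol_to_notes.items() if len(n) > 1}

print(len(CIPHER), len(NOTES))
for symbol in sorted(conflicts):
    print(symbol, "->", conflicts[symbol])
```

Several shapes end up paired with two or three different pitches, so a straightforward one-shape-per-pitch scheme doesn't fit. That says nothing about schemes based on intervals, rhythm or note names in another key, so it narrows the possibilities rather than closing the question.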

Something to think about, anyway. 🙂

[*] Made me laugh, anyway. 🙂

I know, I know, it somehow turned into ‘Beale Cipher Week’ here at Cipher Mysteries without so much as a tiny red flag on my part by way of warning. Having said that, I’m just as surprised as you probably are, and yes, I do have plenty of non-Beale cipher stuff to cover: but stick with where this is all heading, I think it’s actually quite interesting.

While waiting for my Thomas Beale Junior-related books to arrive (yes, the virtual BealeFest Will Continue), I had a quick look around the web to see if anything else Bealesque was going on. Apart from an online comic-book retelling of the Beale pamphlet / myth, all I found of interest was a Reddit user called HughJorgens (fnarr fnarr) asserting that, contrary to what Ward’s pamphlet claims, you don’t get gold and silver mines in the same place.

Might he be right? I didn’t know: but given that the pamphlet specifically claims that Beale’s group found gold and silver north of Santa Fe, I thought it would be useful to briefly review gold and silver mining, and also the specific history of gold and silver mining in Colorado and New Mexico.

Lode vs Vein vs Placer vs Bench

To make sense of what’s going on here, you have to know some gold prospector terminology: so here’s a brief guide.

Gold starts in lodes (rich, clumpy underground deposits, typically in hard rock that needs mining out): but when a river cuts its way through an underground deposit, it breaks fragments of gold away from the lode, and washes them away. If they sink into fissures in the rock, they form underground gold veins (spread out in long thin deposits), or else they get carried away into stream beds containing sand, gravel or earth, known as placers. (Because gold is roughly six or so times more dense than most other materials normally found in a placer, it tends to move more slowly, to fall and then to get wedged in cracks in the bottom of the stream bed.)

Finally… when a stream or river changes its course over time, an old stream bed can be left (quite literally) ‘high and dry’. This is known as a ‘bench’: and the gold found in a bench is a bench deposit. But if you’re looking at bench deposits, you need to dry wash what you dig from the bench, a process that was invented in 1897 by Thomas Edison – before that, prospectors had to bring lots of water with them to wet wash what they had dug out.

So: to successfully pan for gold, you need to be standing on a placer deposit, though ambitious gold prospectors sometimes try to trace a stream-bed back to the lode or veins where the gold in the placer was originally washed away from.

Silver prospecting, however, works quite differently from all this, because silver almost never appears as a placer deposit or bench deposit in the way that gold does, but instead usually appears as ore that needs mining. Moreover, silver ore is also frequently found with various other commercially valuable ores – copper, lead, tin – so offers much more of a conventional mining ‘win’ than gold. Hence a ‘gold rush’ can be a short, sharp shock to a local economy that then quickly disappears, whereas a ‘silver rush’ takes much longer to work through, often decades.

Colorado

In 1859, the first gold in Colorado was found in placer deposits in Cherry Creek, near where it meets the South Platte River. This triggered a Colorado gold rush (known as the Pike’s Peak Gold Rush simply because Pike’s Peak was a well-known local landmark, not because the gold was anywhere near it), which was followed by vein discoveries in a number of locations.

The earliest guide to the history of Colorado Gold was published in 1859 by Le Roy Reuben Hafen. His “The Illustrated Miners’ Hand-book and Guide to Pike’s Peak: With a New and Reliable Map, Showing All the Routes and the Gold Regions of Western Kansas and Nebraska” (available online here, though sadly without a scan of his “New and Reliable Map”) includes some interesting (if somewhat breathless and unreliable-sounding) stories, such as this one:

In 1835, a French Trapper by the name of Eustace Carriere, while lost from his party, wandered through that region for several weeks, during which time he collected some fine specimiens, which he found lying upon the surface, and took them with him to New Mexico. Upon examination, they proved to be pure gold, and a company, with M. Carriere as their guide, soon set out for the new Eldorado. Arriving there, their guide failed to find the precise location where he had seen so much of the “sparkling mineral,” and the Mexicans, under the supposition that he did not wish to disclose to them his new discovery, inflicted upon him a severe whipping, left him, and returned to New Mexico. (p.7)

Not long after gold was ‘properly’ discovered in Colorado in 1859, silver veins too were found in the Central City-Idaho Springs district. Interestingly, “[the] veins of the district are zoned in a roughly concentric manner, with gold-bearing pyrite veins in the center, and silver-bearing galena veins more common in the outlying areas.”

New Mexico

In New Mexico, gold was first discovered in 1828, several decades earlier than in Colorado. Placer deposits were found first (in the Ortiz mountains in Santa Fe County), followed by a lode deposit close by five years later (in 1833), which was still 13 years before New Mexico was incorporated into the United States. Yet New Mexico’s ultra-dry climate made it a difficult place to prospect for gold, particularly in the years before drywashers became available.

Fayette Jones’ (1905) book New Mexico Mines and Minerals talks about stories of old mines in the area, but concludes:

The evidence seems conclusive that no mines of either silver or gold were worked to any extent prior to 1800; save some little gold picked from the gravels at various points throughout the Territory and from the silver lead mines in the vicinity of Los Cerrillos […] (pp.11-12)

Another choice quotation (from what I found a very interesting book) concerned the gold productivity of the whole New Mexico territory at this time:

According to Prince’s History of New Mexico, between $60,000 and $80,000 in gold was taken out annually between the years 1832 and 1835. The poorest years of this period were from $30,000 to $40,000. (p.22)

Thomas Beale’s Gold and Silver?

In Ward’s 1885 pamphlet, the author writes that when Beale’s party found gold “in a small ravine […] in a cleft of the rocks” in April or May of 1818, it was “some 250 or 300 miles to the north of Santa Fe”, i.e. in the middle of modern-day Colorado. “[The] work progressed favorable for eighteen months or more, and a great deal of gold had accumulated in my hands as well as silver, which had likewise been found.”

Yet it looks as though HughJorgens’ (sigh) claim that gold and silver don’t occur together isn’t entirely true, in that both were indeed found close by each other in Central City-Idaho Springs (though only in 1859, and in veins rather than in lodes). And in fact Idaho Springs is very nearly directly north of Santa Fe (the two are about 340 miles apart).

But what bothers me here is the sheer scale of the operation, particularly the silver. The problem with mining silver is that that’s the easy part: it then has to be smelted in order to be commercially useful. But without an established silver industry, there wouldn’t have been any kind of silver smelting infrastructure in place. Colorado may have become known for a while in the late 19th century as the Silver State, but circa 1820, this was all a long way in the future.

According to the decrypted B2 ciphertext:

“The first deposit consisted of one thousand and fourteen pounds of gold, and three thousand eight hundred and twelve pounds of silver, deposited November, 1819. The second was made December, 1821, and consisted of nineteen hundred and seven pounds of gold, and twelve hundred and eighty-eight pounds of silver; also jewels, obtained in St. Louis in exchange for silver to save transportation, and valued at $13,000.”

I don’t know: there are a lot of details here that I can’t quite swallow all at the same time. If it’s right, Beale and his team not only found gold (in vastly greater quantity than Eustace Carriere claimed to have done in 1835), but they also found silver, which is an extremely rare combination. They also seem to have found lode deposits near the surface: the description doesn’t sound as though they were doing anything so gauche as panning at placer deposits. And they also seem to have exchanged a large amount of silver (presumably silver ore?) in St Louis without causing any ripples of suspicion there.
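To put a rough number on the scale, here is a trivial Python tally of the weights quoted from B2. The figures are exactly those in the quotation, taken at face value in pounds as stated; the dictionary layout and variable names are just mine, and the silver swapped for jewels in St. Louis isn’t included because the text gives no weight for it.

```python
# Weights (in pounds) exactly as stated in the decrypted B2 text
DEPOSITS = {
    "November 1819": {"gold_lb": 1014, "silver_lb": 3812},
    "December 1821": {"gold_lb": 1907, "silver_lb": 1288},
}

total_gold_lb = sum(d["gold_lb"] for d in DEPOSITS.values())
total_silver_lb = sum(d["silver_lb"] for d in DEPOSITS.values())
total_metal_lb = total_gold_lb + total_silver_lb

print(f"Gold:   {total_gold_lb} lb")
print(f"Silver: {total_silver_lb} lb")
print(f"Total:  {total_metal_lb} lb")
```

So even before counting whatever silver was traded away for the jewels, the claimed haul comes to roughly four tons of metal.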

Whether or not you buy into the whole ‘thirty gentlemen adventurers’ story is up to you: but there’s something about all this gold and silver business that mechanically and physically doesn’t ring true to me. It’s a thousand-plus miles to St Louis from Santa Fe: that’s a long, long way to haul that stuff, really it is. 🙁