Back in 2007, John Walsh (the host of “America’s Most Wanted”) announced that he had, since 1991, received a string of disturbing-sounding letters from an individual calling himself / herself “The Scorpion”: many of them had sections or pages that were apparently in cipher. Two of these ciphers were released to the public: these became known as “S1” and “S5”.

In the same year, Christopher Farmer (“President of OPORD Analytical”) announced that he had cracked S1 (which was apparently built on a 10×7 grid):-

scorpion3

Farmer’s claimed solution reads like this:-

baelprovid
edthemwith
newstories
butwhatifi
askjwdoiwa
xrtwbonesa
gezjefxkon

Unfortunately, all the diagrams illustrating Farmer’s ingenious reasoning have withered on the Internetty vine in the years since then (they’re not even in the Wayback Machine, nor anywhere else as far as I can see), which is a bit of a shame.

Even so, this turns out to be an entirely surmountable problem: Farmer’s claimed solution is clearly incorrect, for the simple reason that letters in the ciphertext aren’t consistent in the plaintext. For example, the cipher “K” maps to both ‘a’ and ‘g’, the “backwards-L” maps to ‘w’, ‘w’, and ‘x’, the “backwards-F” maps to both ‘u’ and ‘v’, and so on. At the same time, his claimed plaintext doesn’t really make a lot of sense (“BAEL”… really? I’m not so sure).

It seems likely to me that Farmer guessed that “PROVID” was steganographically hidden in plain sight at the end of the topmost line (and if you squint a bit, you can see why that would be), and then built the rest of his decryption attempt around this hopeful starting point. Moreover, he seems to have guessed that “O” maps to ‘o’, and “backwards-E” maps to ‘e’, which are both pretty peachy assignments. But I don’t buy any of this for a minute: there are way too many degrees of freedom in this S1 cryptogram (roughly half of the individual cipher shapes occur exactly once), and quite a few extra ones in his claimed solution too.

It’s a brave attempt, for sure: but it’s still wrong, whichever way you turn it round.

Other people have tried their hand with S1, though both AlanBenjy in 2009 and Glurk on Dave Oranchak’s site in 2010 pessimistically pointed out that 53 of S1’s 70 symbols are unique, yielding a ‘multiplicity’ a fair way beyond the range of what homophonic cryptograms can practically be solved. Hence I would tend to agree with their assessment that there’s no obvious way that we will solve S1 with what we currently have to hand: in fact, there seems no way to tell whether S1 is a real cipher or a hoax – the only repeating cipher pair is “S A” (i.e. “S Λ”), which could well have happened by pure chance.

The only other Scorpion ciphertext released to the public to date is the 180-character cryptogram known as “S5”:-

scorpion4

Once again, 155 of these 180 symbols are unique, which at first glance would seem to make S5 even less likely to be solved than S1.

But wait! In May 2007, user “Teddy” on the OPORD Analytical forum pointed out that if you transpose S5 from a 12-column arrangement to a 16-column layout, shape repeats only ever occur within a single vertical column. In fact, every single 16-way column except one (column #5) includes one or more repeated shapes.

Radically, this suggests to me that S5 was constructed in a completely different way from conventional homophonic ciphers: specifically, I think that each 16-way column of S5 may well have its own unique cipher alphabet. This would mean that S5 would need to be solved in a completely different manner to the way, say, zkdecrypto works. (I don’t believe S5 was constructed with eight columns, but I thought I ought to mention that that’s a possibility as well, however borderline). Maybe that small insight will be enough to help someone make some headway with S5, who can tell?

The huge shame here is that it may well be that the other Scorpion ciphers (which to this day have not been released) might well give us additional clues about the inner workings of both S1 and S5. Specifically, if one of the other ciphers happened to have used precisely the same 16-alphabet systemas S5, it might well give us enough raw data to crack them both.

Has anyone apart from John Walsh ever seen S2, S3, S4, and S6? Just askin’, just askin’…


Update: Looking again at S1 (while bearing in mind the way S5 seems to have been constructed), I find it hard not to notice that the distances between instance repetitions seem strongly clustered around multiples of 5 (with the only instance not fitting the pattern being the “backwards-L” on row #5):-

+60, +20, +50, +36, +24, +20, +40, +20, +40, +25, +35, +10, +25, +6, +45, +9, +6.

I suspect that this means that the encipherer probably enciphered S1 by cycling through five independent cipher alphabets (largely speaking). This wasn’t a mechanically precise encipherment (whether by accident or by design), but something close enough to one such that almost all the time he/she was no more than a single alphabet ‘off’, one way or the other.

This offers a quite different kind of constraint from normal homophonic cipher searches, and possibly even enough to crack the S1 cipher. After all, we have a fair amount of the Scorpion’s meandering plaintext to use as a statistical model to aim for… 🙂