The story of the ‘Scorpion’ letters to John Walsh, host of “America’s Most Wanted” and (more recently) “The Hunt with John Walsh”, is now reasonably well known. From 1991, Walsh received a string of threatening letters, which also contained cryptograms, from someone signing themselves “SCORPION”. Since 2007, two of these cryptograms (“S1” and “S5”) have been released by the FBI: however, none has yet been solved.
The Scorpion also wrote:
I now realize with many hundreds of hours of mindracking experimentation with my complex ciphers that my first one that I sent you was comparatively simple to my second, third, fourth, and now temporarily final cryptograph system. I have been encoding useful information for your use and have done it fairly, since all of my ciphers can be decoded simply, once the limited patterns and systems are discovered.
I’ve blogged before about how the S5 cryptogram (arranged as 15 rows of 12 symbols each) only ever has repeats where the distance between the repeated symbols is a multiple of 16, suggesting that it may well be composed of 16 strictly cycling cipher alphabets. I similarly suggested that S1 appeared to have repeats largely centred around multiples of 5, though this distance was far less solid.
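For anyone wanting to reproduce this kind of repeat-distance check on their own transcription, here is a minimal Python sketch. The function names and the toy symbol labels are my own illustrative assumptions, not a transcription of S1 or S5: it simply lists the distance between every pair of identical symbols, then takes the greatest common divisor of those distances.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeat_distances(symbols):
    """Sorted distances between every pair of positions holding the same symbol."""
    positions = defaultdict(list)
    for index, symbol in enumerate(symbols):
        positions[symbol].append(index)
    distances = []
    for pos in positions.values():
        for a in range(len(pos)):
            for b in range(a + 1, len(pos)):
                distances.append(pos[b] - pos[a])
    return sorted(distances)


def common_period(symbols):
    """GCD of all repeat distances (0 if nothing repeats at all)."""
    return reduce(gcd, repeat_distances(symbols), 0)


# Hypothetical toy ciphertext built from three cycling alphabets.
toy = ["a1", "b1", "c1", "a2", "b1", "c2", "a1", "b2", "c1"]
print(repeat_distances(toy))  # [3, 6, 6]
print(common_period(toy))     # 3
```

If every repeat distance is a multiple of some number greater than 1, that number is a candidate for the count of cycling alphabets. With a messier ciphertext (such as S1 seems to be), a histogram of the distances is probably more informative than a strict GCD, which a single stray repeat will collapse to 1.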
Here’s what S1 (the first Scorpion cryptogram) looks like:
To make some kind of organizational sense of this, I tried to follow the basic pattern laid down by the S5 ciphertext, by:
* assigning symbols to five cycling columns
* mostly resetting these at the leftmost column of ten
* assuming that the encipherer’s first cipher system usage wasn’t as disciplined as his later (far more complex) efforts.
Here, you should be able to see all the same symbols as S1 (and in the same order), but assigned to five columns, where the shapes in each column are (mostly) thematically grouped. The only exception to this rule is the mirrored ‘L’ shape, which appears both in column #2 and in column #4. My strong suspicion is that this was an enciphering slip, where a simple geometric shape appeared in two different columns’ cipher alphabets by mistake.
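The grouping step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the symbol names are made up, the row data is a toy rather than S1, and the group counter here always resets at the start of a row (whereas I only claimed that S1 mostly does so).

```python
def assign_groups(rows, period=5):
    """Pair each symbol with a group index cycling 0..period-1, reset per row."""
    return [[(col % period, sym) for col, sym in enumerate(row)] for row in rows]


def group_alphabets(rows, period=5):
    """Collect the set of symbols seen in each cycling group."""
    alphabets = [set() for _ in range(period)]
    for row in assign_groups(rows, period):
        for group, sym in row:
            alphabets[group].add(sym)
    return alphabets


def shared_symbols(alphabets):
    """Symbols that turn up in more than one group's alphabet."""
    seen = {}
    for group, alphabet in enumerate(alphabets):
        for sym in alphabet:
            seen.setdefault(sym, []).append(group)
    return {sym: groups for sym, groups in seen.items() if len(groups) > 1}


# Hypothetical toy rows: "L" sits in the 2nd and 4th columns (indices 1 and 3).
rows = [["a", "b", "c", "d", "e"],
        ["f", "L", "g", "L", "h"]]
print(shared_symbols(group_alphabets(rows)))  # {'L': [1, 3]}
```

A symbol reported by `shared_symbols` is either an enciphering slip of the kind I suspect for the mirrored ‘L’, or a sign that the column assignment is wrong at that point.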
Is this solvable? If I’m even roughly correct about the grouping, then S1 is, like S5, almost exactly the same category of cipher as the sequence of challenge ciphers I put forward in 2017 (all of which remain uncracked). There, the first challenge cipher was 153 symbols long, laid out in five perfectly cycling groups. This was more than twice as long as S1, with the added benefit that I even told you exactly what kind of cipher it was. The second challenge ciphertext was slightly shorter (118 symbols), and so forth.
Can We Crack S1?
On the one hand, the multiplicity of the Scorpion ciphertexts is very high, meaning that pure homophone solvers stand almost no chance.
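For reference, multiplicity is normally taken to be the number of distinct symbols divided by the ciphertext length. A minimal Python sketch (the function name and the toy input are my own, not taken from any particular solver):

```python
def multiplicity(symbols):
    """Number of distinct symbols divided by ciphertext length."""
    if not symbols:
        raise ValueError("empty ciphertext")
    return len(set(symbols)) / len(symbols)


# Toy example: 3 distinct symbols in a 4-symbol text.
print(multiplicity(list("ABCA")))  # 0.75
```

The closer this gets to 1, the fewer repeats a solver has to work with.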
On the other hand, I’m pretty sure that these aren’t pure homophonic ciphers, insofar as each group of symbols almost certainly will have at most one A shape, at most one B shape etc. We might also try searching ‘down’ from setups that assume that repeated symbols in each group are not randomly chosen, but are most likely frequently used letters, e.g. ETAOINSH. With a long enough ciphertext to work with, this would be the preferred ‘classical’ way to attack the cipher: but, alas, we only have short ciphertexts to work with here. 🙁
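To make the “at most one A shape per group” constraint concrete, here is a small Python sketch of how a candidate key could be checked against it. The key layout (a dict from `(group, symbol)` to a plaintext letter) and all the names are assumptions of mine for illustration, not how any existing solver represents things.

```python
def find_clashes(key):
    """Return (group, letter, first_symbol, second_symbol) for every case where
    two different symbols in the same group decode to the same letter."""
    used = {}
    clashes = []
    for (group, symbol), letter in key.items():
        slot = (group, letter)
        if slot in used:
            clashes.append((group, letter, used[slot], symbol))
        else:
            used[slot] = symbol
    return clashes


# Hypothetical candidate key: group 0 has two different shapes both meaning "E".
candidate = {
    (0, "square"): "E",
    (0, "triangle"): "T",
    (1, "circle"): "E",   # fine: same letter, but a different group
    (0, "star"): "E",     # clash with (0, "square")
}
print(find_clashes(candidate))  # [(0, 'E', 'square', 'star')]
```

A solver that rejected (or heavily penalised) any key with clashes would be searching a much smaller space than a pure homophonic solver, which is the whole point of treating these as constrained homophonic ciphers.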
However, my understanding is that there have been a handful of historical examples where particular ciphertexts of this general type (i.e. based around a cycle of interleaved cipher alphabets) have been cracked by determined cryptanalysts. So I’m not yet convinced it’s impossible.
All the same, has a specifically optimized machine algorithm for cracking these ever been put forward?
“… the multiplicity of the Scorpion ciphertexts is very high, meaning that pure homophone solvers stand almost no chance.” Nick, I may have asked this before — apologies if this is repetitive — but can you point me to a couple references for (semi-?)automated solution of homophone ciphers? Google was not as useful as one might hope last time I was looking…
Maybe someone (not me) could try to apply an algorithm like this (http://www.cs.sjsu.edu/~stamp/RUA/homophonic.pdf) to the Scorpion cipher?
Karl and Thomas: the best-in-class freely available homophonic solver is Jarlve’s AZdecrypt.
* v1.11, Windows GUI: http://www.zodiackillersite.com/viewtopic.php?f=81&t=3198
* Lite version, runs on Web: http://jarlve.vdm-service.be/
As Dave Oranchak points out, to make any headway with high multiplicity homophonic ciphertexts (such as the Scorpion’s, even if they are constrained homophonic ciphertexts rather than pure homophonic ciphertexts), you’d need to use Jarlve’s 7-gram dictionaries, which only come with the Windows GUI version. I haven’t tried this out on my challenge ciphers (it felt wrong trying to crack my own challenge ciphers), so feel free to give this a go. 🙂
I forgot about your challenge ciphers (oops). Might have to revisit some time….
Hi Nick.
The Scorpion cipher is not a complicated one.
It looks as though it was written by the “Zodiac” himself.
Scorpion S1.
1. K.K.A.C.L.L.O.F.C.O.
2. A,V.O.-Y.Q.L.C.T.O.
3. S.A.O.T.T.(.) Z.N.L.U.
4. B.F.O.I.M.O.T.R.O.I.
5. U.N.I.I.O./O.R.L.O.
6. O.N.U.L.O.2.S.A.T.U.
7. K.A.Z.I.E.A.L.I.L.I.
Hey Nick,
S5 is indeed like your challenge ciphers! If the symbols in the Scorpion ciphers are randomly chosen then they most likely cannot be solved.
My program cannot solve your challenge ciphers, but you got me thinking about a new approach: run your ciphers through a very large corpus and save some of the best fits to use later as cribs. That needs a couple of test ciphers first. I happen to have a 771 GB corpus sitting by, but I remember you asking people not to attack with big data.
Jarlve: hit them with everything you’ve got, the Big Data ban period has expired. 🙂
For clarity, I banned Big Data so that people didn’t try to break my challenge ciphers as a pure data mining exercise, when the point was to stimulate people to understand the cryptology of constrained homophonic cipher texts.
But what you’re proposing is using Big Data to generate mega-cribs, which is cool. 🙂
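To give a rough feel for what a corpus-driven crib search might involve, here is a deliberately simplified Python sketch. It is my own guess at the general shape and not Jarlve’s method: it swaps “best fits” for a plain yes/no consistency test, assumes strict cycling with no row resets, and uses a tiny made-up corpus string. A window of corpus text is kept only if, within each cycling group, the same symbol always lines up with the same letter and different symbols line up with different letters.

```python
def fits(cipher, window, period):
    """True if window is a consistent plaintext for cipher under the
    one-letter-per-symbol-per-group constraint."""
    sym_to_letter = {}
    letter_to_sym = {}
    for i, (sym, letter) in enumerate(zip(cipher, window)):
        group = i % period
        # Same symbol in the same group must always decode to the same letter.
        if sym_to_letter.setdefault((group, sym), letter) != letter:
            return False
        # Within a group, a letter may be claimed by only one symbol.
        if letter_to_sym.setdefault((group, letter), sym) != sym:
            return False
    return True


def crib_candidates(cipher, corpus, period):
    """Every corpus window (same length as cipher) that passes fits()."""
    n = len(cipher)
    return [corpus[i:i + n]
            for i in range(len(corpus) - n + 1)
            if fits(cipher, corpus[i:i + n], period)]


# Hypothetical toy: two cycling groups, "x" repeats in group 0.
cipher = ["x", "y", "x", "z"]
print(crib_candidates(cipher, "ABACABAB", period=2))  # ['ABAC', 'ACAB']
```

With ciphertexts as short and as repeat-poor as the Scorpion’s, a test like this will let through an enormous number of windows, so some kind of language-model scoring would still be needed to rank them.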
Jarlve: incidentally, is there a way of forcing AZdecrypt to always contain a cribbed word or phrase at a certain offset, i.e. to lock a series of homophones to a fixed set of letters? Sorry if this is a FAQ!
Hey Nick,
I am working on making it possible to lock homophones to certain letters and cribs for 1.13. It is just a matter of coming up with a reasonably user-friendly solution. Will let you know when it is done.