Following my recent Scorpion Ciphers post, I’ve put up a permanent reference page on the Scorpion Ciphers and have also tried to contact John Walsh about the as-yet-unreleased other ciphers… so we’ll see how that goes.
Since then, I’ve been working a little more with S5, which has 155 unique symbols out of 180 letters. Because repeated symbols in S5 are always multiples of 16 letters apart, it seems likely to me that this ciphertext was constructed from 16 independent alphabets cycled through in strict sequence. My hope was that this regularity might give us a better chance of cracking S5 than if it were a randomly chosen homophonic cipher.
All the same, this was just a guess: so the first thing I did was come up with a way to test this hypothesis, by writing a short C program to encipher 180-long subsections of the Scorpion’s own plaintext using various numbers of sequential alphabets, to see if this would produce roughly 155 unique symbols.
For each number of alphabets (e.g. 2), I tried (notionally) enciphering every 180-long stretch of the Scorpion’s text, and kept a tally of the minimum number of symbols required (e.g. 37), the maximum number of symbols required (e.g. 44), and the average number of symbols required (e.g. 40).
Interestingly, the results weren’t what I expected:-
alphabets = 1, uniques = (19..24) 21
alphabets = 2, uniques = (37..44) 40
alphabets = 3, uniques = (50..61) 55
alphabets = 4, uniques = (60..74) 68
alphabets = 5, uniques = (72..86) 79
alphabets = 6, uniques = (77..97) 87
alphabets = 7, uniques = (88..105) 97
alphabets = 8, uniques = (91..110) 101
alphabets = 9, uniques = (92..116) 106
alphabets = 10, uniques = (104..122) 113
alphabets = 11, uniques = (107..127) 117
alphabets = 12, uniques = (113..136) 122
alphabets = 13, uniques = (113..134) 123
alphabets = 14, uniques = (115..138) 129
alphabets = 15, uniques = (123..146) 132
alphabets = 16, uniques = (120..147) 133
alphabets = 17, uniques = (128..146) 136
alphabets = 18, uniques = (126..151) 137
alphabets = 19, uniques = (128..150) 139
alphabets = 20, uniques = (132..153) 143
alphabets = 21, uniques = (133..159) 144
alphabets = 22, uniques = (131..155) 145
alphabets = 23, uniques = (137..154) 145
alphabets = 24, uniques = (137..157) 147
alphabets = 25, uniques = (139..160) 149
alphabets = 26, uniques = (141..158) 149
alphabets = 27, uniques = (143..163) 152
alphabets = 28, uniques = (143..164) 152
alphabets = 29, uniques = (139..164) 153
alphabets = 30, uniques = (145..164) 154
alphabets = 31, uniques = (143..164) 153
alphabets = 32, uniques = (146..167) 156
That is to say, even though S5 looks as though it is strictly cycling through 16 ciphers, this isn’t consistent with the stats of the Scorpion’s other plaintext (because that is so verbose and repetitive that it would require on average 32 alphabets to typically yield 155 symbols).
What I think this is implying is either (a) that the Scorpion’s plaintext is significantly less repetitive than the text of his/her messages, or (b) that the cipher system the Scorpion used also employs an extra layer of compression (e.g. a nomenclatura, using extra tokens for common words such as [THE] and [AND], or even common syllable pairs).
I don’t know… I’ll have to have a further think about this, it isn’t at all obvious what’s going on here.
Update: having scratched my head about this for a few more hours, I don’t feel comfortable with the suggestion that some kind of nomenclatura is involved. Rather, what I suspect now is that what we’re looking at here is not a 16 x 26-token set of ciphers (i.e. A-Z) but a 16 x 36-token set of ciphers (i.e. A-Z plus 0-9), coupled with a slightly less verbose plaintext. Hence my very rough (and admittedly as yet unmodelled) estimate is that roughly 25-35 of the tokens in the plaintext will turn out to be digits.
Unfortunately, I also think that this may have left the text undecryptable, unless there is some additional kind of meta-consistency between shapes across the 16 alphabets (e.g. if all the circle-plus-upright-cross shapes encode the same underlying plaintext token). Oh well!


