In many ways, Beale Cipher B1 is a lot like the Zodiac Killer’s Z340 cipher, insofar as each has what seems to be a direct predecessor homophonic ciphertext (B2 and Z408 respectively) that has been very publicly solved: and yet we seem unable to exploit either ciphertext’s apparent similarities in system and presentation to its respective parent.
At the same time, it’s easy to list plenty of good reasons why Beale Cipher B1 has proved hard to crack (even relative to B2), e.g. its very large proportion of homophones, the high likelihood of transcription errors, etc. Combining just these two would seem to be enough to push B1 out of the reach of current automatic homophone crackers, even (sorry Jarlve) the very capable AZdecrypt.
But in many ways, that’s the easy side of the whole challenge: arguably the difficult side is working out why B1’s ciphertext is so darned improbable. This is what I’ve been scratching my head about for the last few months.
Incremental Series
I posted a few days ago about the incremental sequences in B1 and B3 pointed out by Jarlve, i.e. where the index values increase (or indeed decrease) in runs. Jarlve calculated the (im)probability of this as 4.61 sigma for B1 (pretty unlikely), 2.72 sigma for B2 (unlikely, but not crazily so), and 9.86 sigma for B3 (hugely unlikely).
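I don’t know exactly how Jarlve arrived at those figures, but here’s a minimal sketch (in Python) of one way you might put a comparable sigma value on a ciphertext’s incrementality: count the adjacent index pairs that rise, then compare that count against random shuffles of the same index values. The b1_indices list here is just a placeholder; you’d have to fill it with the actual B1 numbers yourself.

import random
import statistics

def rising_pairs(seq):
    # Count adjacent pairs where the next index is larger than the current one.
    return sum(1 for a, b in zip(seq, seq[1:]) if b > a)

def incrementality_sigma(seq, trials=10000, seed=0):
    # Express the observed rising-pair count as a z-score (sigma) against
    # random shufflings of the same index values.
    rng = random.Random(seed)
    observed = rising_pairs(seq)
    work = list(seq)
    shuffled_counts = []
    for _ in range(trials):
        rng.shuffle(work)
        shuffled_counts.append(rising_pairs(work))
    return (observed - statistics.mean(shuffled_counts)) / statistics.stdev(shuffled_counts)

# b1_indices = [ ... the full B1 index sequence goes here ... ]
# print(incrementality_sigma(b1_indices))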
Why should this be the case? On the one hand, I can broadly imagine the scenario loosely described by Jim Gillogly, where an encipherer pulling random index values from the same table of homophones used to construct B2 sometimes lets that randomness degenerate into sweeping across or down the table (depending on which way round it was written out), and where that might (somehow) translate into the broadly positive incrementality we see in B1.
But this kind of explanation asks more questions than it answers, unfortunately.
Gillogly / Hammer Sequences
Surely anyone who has read more than just the surface details of the Beale Ciphers will know of the mysterious Gillogly strings in Beale Cipher B1 (which were in fact discussed at length by both Gillogly and Carl Hammer).
On the one hand, finding strings in broadly alphabetic sequence within the resulting plaintext (if you apply B2’s codebook to B1’s index numbers) would seem to be a very improbable occurrence.
And yet the direct corollary of this is that the amount of information stored in those alphabetic sequences is very small indeed: in fact, it’s close to zero.
One possible explanation is that those alphabetic sequences are nothing more than nulls: and in fact this is essentially the starting point for Gillogly’s dissenting opinion, i.e. that the whole B1 ciphertext is a great big null / hoax.
Alternatively, I’ve previously speculated that we might be looking here at some kind of keyword ‘peeking’ through the layers of crypto, i.e. where “abcdefghiijklmmnohp” would effectively be flagging up the keyword used to reorder the base alphabet. For all that, B1 would still be no more than a “pure” homophonic cipher, DoI notwithstanding. As a sidenote, I’ve tried a number of experiments using parts of the B2 codebook (e.g. only the most reliable entries, and only for some letters) to ‘reduce’ the number of homophones used by the B1 ciphertext, to try to finesse it to within reach of AZdecrypt-style automatic cracking, but with no luck so far. Just so you know!
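For what it’s worth, here’s a rough sketch (again in Python) of the kind of reduction experiment I mean: substitute only those B1 indices whose B2-codebook letters you trust, and keep everything else as distinct homophone symbols, so that the overall homophone count (hopefully) drops to within an automatic solver’s reach. The b2_codebook dict and trusted_letters set here are placeholders, not the real tables.

def reduce_homophones(b1_indices, b2_codebook, trusted_letters):
    # Collapse B1 index numbers onto their B2-codebook letters, but only for
    # letters we trust; every other index stays as its own unresolved homophone.
    reduced = []
    for idx in b1_indices:
        letter = b2_codebook.get(idx)      # first letter of the DoI word at this index, if known
        if letter in trusted_letters:
            reduced.append(letter)
        else:
            reduced.append(str(idx))
    return reduced

# Placeholders only -- the real tables come from the DoI / the B2 decryption:
# b2_codebook = { index: first_letter_of_doi_word, ... }
# trusted_letters = {'e', 't', 'a', 'o'}
# print(' '.join(reduce_homophones(b1_indices, b2_codebook, trusted_letters)))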
I’ve also wondered recently whether the abcd part might simply be a distraction, while the homophone index of each letter (e.g. 1st A -> 1, 2nd A -> 2, 3rd A -> 3, etc) might instead be where the actual cipher information is. This led me to today’s last piece of improbability…
The Problem With jklm…
Here’s a final thing about the famous alphabetic Gillogly string that’s more than a bit odd. If you take…
- the B1 index (first column)
- map it to the slightly adjusted DoI numbering used in the B2 ciphertext (second column, hence 195 -> 194)
- read off the adjusted letter from the DoI (third column, i.e. “abcdefghiijklmmnohp”)
- print out the 0-based index of that homophone (fourth column, i.e. “0” means “the first word beginning with this specific letter in the DoI”)
- and print out how many times that letter appears in the DoI
…you get the following table:
147 -> 147 -> a -> 16 / 166
436 -> 436 -> b -> 12 / 48
195 -> 194 -> c -> 7 / 53
320 -> 320 -> d -> 10 / 36
37 -> 37 -> e -> 2 / 37
122 -> 122 -> f -> 1 / 64
113 -> 113 -> g -> 1 / 19
6 -> 6 -> h -> 0 / 78
140 -> 140 -> i -> 5 / 68
8 -> 8 -> i -> 1 / 68
120 -> 120 -> j -> 0 / 10
305 -> 305 -> k -> 0 / 4
42 -> 42 -> l -> 0 / 34
58 -> 58 -> m -> 0 / 28
461 -> 461 -> m -> 7 / 28
44 -> 44 -> n -> 1 / 19
106 -> 106 -> o -> 7 / 144
301 -> 301 -> h -> 7 / 78 [everyone thinks this one is wrong!]
13 -> 13 -> p -> 0 / 60
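In case anyone wants to check these figures for themselves, here’s roughly how a table like this can be built (in Python): take the DoI as a numbered list of words (with whatever numbering adjustments you prefer already applied), read off the first letter of the indexed word, then count how many earlier words start with that same letter (the 0-based homophone index) and how many words start with it in total. The doi_words list is a placeholder for your own DoI word list.

def homophone_stats(index, doi_words):
    # For a 1-based DoI word index, return (letter, 0-based homophone index,
    # total number of DoI words beginning with that letter).
    letter = doi_words[index - 1][0].lower()
    earlier = sum(1 for w in doi_words[:index - 1] if w[0].lower() == letter)
    total = sum(1 for w in doi_words if w[0].lower() == letter)
    return letter, earlier, total

# doi_words = ["when", "in", "the", "course", ...]   # the adjusted DoI word list goes here
# for b1_index, adjusted_index in [(147, 147), (436, 436), (195, 194), ...]:
#     letter, nth, total = homophone_stats(adjusted_index, doi_words)
#     print(f"{b1_index} -> {adjusted_index} -> {letter} -> {nth} / {total}")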
What I find strange about this is not only that the “jklm” sequence is in perfect alphabetic order, but also that each of its letters is the 0th instance of that letter in the DoI (i.e. the very first DoI word beginning with j, k, l, and m respectively). To me, this seems improbable in quite a different way. (Perhaps Dave Oranchak and Jarlve will now both jump in to tell me there’s actually a 1 in 12 chance of this happening, and I shouldn’t get so excited.)
The reason I find this extremely interesting is that it specifically means that the jklm sequence contains essentially zero information: the B2-codebook-derived letters themselves form a pure alphabetic sequence (and so each can be perfectly predicted from the letter next to it), while each letter is enciphered by the index of the very first word in the DoI that begins with it.
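To put a rough (and very hedged) number on that, using the letter counts from the table above: if the homophone for each of j, k, l, and m had been chosen freely, those four choices alone could have carried roughly fifteen bits of information, whereas always choosing the 0th instance carries none at all.

from math import log2

# DoI word counts for letters j, k, l, m (taken from the table above)
counts = {'j': 10, 'k': 4, 'l': 34, 'm': 28}

free_choice_bits = sum(log2(n) for n in counts.values())   # ~15.2 bits if homophones were picked freely
fixed_choice_bits = sum(log2(1) for _ in counts)            # 0 bits if the 0th instance is always used
print(free_choice_bits, fixed_choice_bits)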
This means (I think) that there isn’t enough information encoded inside the jklm sequence to encipher anything at all: which I suspect may actually prove to be a very important cryptologic lemma, in terms of helping us eliminate certain classes of (or attempts at) solutions.
>there isn’t enough information encoded inside the jklm sequence to encipher anything at all
I don’t think so. There are 4 words encoded: “just”, “King”, “laws”, “mankind”.
If we adjust B1 using the encoding errors of B2, around character 313, we have the string:
nhwtste
Notice anything odd? What about this?…
n.w.s.e
If this were found in any random text, it would be a nothing burger for sure. But in C1, the fact that these are cardinal directions, and in a (counter-clockwise) order, seems odd.
sbs*etfa*gcdottucwitwtaaisdbtidtt*wtabaabadaaabbcdeffiflkigpeamnpwchofoallpmotamanhabbbccccddeaosdsttntftatpocacbcddeletpftbthfffehuubtjtttihpaoalsatatttohnmpaaaarbopjdtt**tsbcoadafacpnrbabcdefghiijklmmnohppowtaombblsoesaofsispctaolbtflhioasgtwtenklcassaastatofbtawgfeaacoaiattwhottaaoetsafaasbstciarcabtotocldc [nhwtste] hiioatstwttsoftastaatsiwcpcwsotlinieeittdattpiufaerfabptcoaoidnattoatstg**atmatwnwttocwtotpatsotebttrphbtogcwcdrolitiahlwaascbostafaewcp*otowltoikctewtafoacwottttothrisjethacuao*plihrmsstrasnitpctilwftf