02 Dec 2013

The Blitz Ciphers and null detection algorithms…

I’ve just added a new permanent page on the mysterious Blitz Ciphers to Cipher Mysteries. Basically, I discovered a few days ago that I had much higher resolution versions of the three scans so far released than I remembered having (i.e. 4MP rather than 1MP), which gave me a good-sized shove to put a proper page up for them. Also, a big hat tip to Edward B for asking if I had anything so useful as decent-sized scans. 😉

But that has also prompted me to revisit the issue of why the Blitz Ciphers apparently aren’t trivially crackable with normal crypto tools (e.g. zkdecrypto etc). And that prompted me to think again about how to go about detecting / predicting nulls.

As always, I had a look for null detection algorithms on the Internet, just in case there was some kind of magical analytical framework out there that I had somehow missed (there’s nothing in Cryptool etc). The closest mention I found was a 2005 message from Brian Tawney on the Voynich mailing list that described his independent version of my own homebrewed null character detector: this comes down to comparing the distribution of letters preceding / following each suspected null and seeing how closely that distribution approximates the context-less distribution of letters in the message.
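For the curious, here is a minimal Python sketch of that kind of neighbour-distribution test. It is not a reconstruction of either implementation mentioned above: the function name `neighbour_null_scores`, the one-character-per-glyph transcription, and the use of total variation distance as the closeness measure are all assumptions made purely for illustration.

```python
from collections import Counter


def neighbour_null_scores(text):
    """Rank glyphs by how closely their neighbours match the overall
    (context-less) glyph distribution. Lower score = more null-like.

    Assumes `text` is a string with one character per glyph.
    """
    scores = {}
    for g in set(text):
        # Context-less distribution of every glyph except the candidate.
        background = Counter(c for c in text if c != g)
        # Glyphs found immediately before or after the candidate.
        neighbours = Counter()
        for i, c in enumerate(text):
            if c != g:
                continue
            if i > 0 and text[i - 1] != g:
                neighbours[text[i - 1]] += 1
            if i + 1 < len(text) and text[i + 1] != g:
                neighbours[text[i + 1]] += 1
        n_total = sum(neighbours.values())
        b_total = sum(background.values())
        if n_total == 0 or b_total == 0:
            continue  # nothing to compare for this glyph
        # Total variation distance between the two distributions
        # (an assumed choice; other distance measures would also work).
        scores[g] = 0.5 * sum(
            abs(neighbours[s] / n_total - background[s] / b_total)
            for s in background
        )
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy stream: "x" is scattered through a rigidly repeating "abcd" pattern.
toy = ("xabcd" + "axbcd" + "abxcd" + "abcxd") * 5
print(neighbour_null_scores(toy))
```

On this toy stream the scattered glyph "x" comes out first, because it sits next to every other glyph about equally often. A vowel in a real plaintext would tend to score in much the same way, so a low score is not proof of a null.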

For what it’s worth, my implementation of this hack predicts that B / D / E / M / S are the glyphs in the Blitz Ciphers most likely to be nulls: while another null-detector hack I wrote (that calculates the per-glyph difference in 1st order entropy if individual characters are removed from the stream) predicts that E / B / M / C / S / D may well be nulls.
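The entropy-difference idea can be sketched in a few lines of Python too. This is only an illustration under stated assumptions: "1st order entropy" is taken here to mean the Shannon entropy of single-glyph frequencies, the stream is a string with one character per glyph, and the names `first_order_entropy` and `entropy_deltas` are hypothetical. How large or small a difference should count as null-like is left open.

```python
import math
from collections import Counter


def first_order_entropy(text):
    """Shannon entropy (bits per glyph) of single-glyph frequencies."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_deltas(text):
    """For each glyph, the change in entropy when that glyph is removed
    from the stream: H(text without glyph) - H(text)."""
    base = first_order_entropy(text)
    deltas = {}
    for g in set(text):
        stripped = "".join(c for c in text if c != g)
        deltas[g] = first_order_entropy(stripped) - base
    # Sorted from largest drop in entropy to smallest.
    return sorted(deltas.items(), key=lambda kv: kv[1])


print(entropy_deltas("aabbc"))
```

With single-glyph entropy the difference depends only on glyph frequencies, so a version based on conditional (digraph) entropy might well be more informative; that substitution is a further assumption and is not shown here.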

So far so reasonable, you might initially think. But from my perspective, the problem with this is that nulls also behave a lot like vowels, in that they sit comfortably next to many different other letters / glyphs, can occur quite often within a ciphertext, and tend not to contain much information (in the Shannon entropy sense of the word). So I’m very far from convinced that I could tell nulls from vowels, or even from high-frequency nomenclature tokens (such as “the”, “and”, or perhaps “Freemason” 😉 ).

I might be wrong, but arguably the biggest theoretical problem with both of these hacks is that I think we would need to feed them a substantially larger ciphertext (I’m guessing 10x larger?) to get properly significant results in the presence of other cipher mechanisms (e.g. homophonic equivalents). Whereas a decent simple substitution cipher cracker can have a good go at breaking a monoalphabetic cipher with as little as, say, 30 characters… so these hacks may be clever, but they seem an order of magnitude cruder than proper cryptanalytical kit.

So… where are all the null detection algorithms? It seems to me that the cryptanalytical tools written these days are more focused on the statistical nuances of computer-era cryptography, while old school trickery (such as nulls and homophones) gets relatively little attention outside of the Zodiac Killer Ciphers world. Maybe there just aren’t any out there.

…unless you know better? 🙂

27 thoughts on “The Blitz Ciphers and null detection algorithms…”

Job on December 3, 2013 at 5:56 am said:

What sort of results do you get if you run your null detection algorithms on the Voynich?

Nulls are so discouraging, they force you to discard a substantial portion of the text. It would be so much better if the full text was readable.

Plus, with the ambiguity they introduce, it’s a bit like anagrams. Any proposed decoding mechanism is automatically less credible.

I wouldn’t be surprised if null-detection happened to be NP-Complete. It’s a bit like Subgraph Isomorphism, in the sense that we are searching for the largest subset of cipher characters that can be decoded using a trivial decryption method – with the assumption that no one would bother to use both nulls and a complex decoding mechanism.

That might explain the absence of a null-detection framework. IMO, when it comes to tough ciphers like these, we should pick a language L (e.g. Italian), a trivial decoding mechanism M (e.g. simple substitution, or nothing at all) and then do some crunching in an attempt to prove that the cipher does not consist of L+M+Nulls. Then proceed to eliminate each such pairing, one by one.

I would be satisfied to know that the Voynich is not just Italian with a replaced alphabet and scattered nulls. That’s a step forward.
T Anderson on December 3, 2013 at 9:35 am said:

Nick, you have a good point that old-fashioned methods are much less studied these days. I also agree with Job about the difficulty. To my knowledge there are null cipher tools in some steganography frameworks, but I have no idea how suitable they are.

The Blitz Cipher(s) may yield to further computational analysis. On the other hand, for the VMS we are missing the a-ha: I’m not saying it doesn’t have nulls, just that the context rules would need some sort of consistency.
SirHubert on December 3, 2013 at 9:51 am said:

What an interesting post and reply.
Nulls do disrupt letter frequency analysis – at least, by brute force, in that you can’t just count instances of letters in a ciphertext and compare with known letter frequencies. But they don’t actually alter the underlying distributions within the plaintext.

To some extent, nulls make common digraphs harder to recognise because you can space out the letters in ‘the’ with enciphered blanks. But again, they don’t change the order of the letters themselves: ‘h’ still follows ‘t’ and ‘e’ follows ‘h’. There’s still an underlying pattern.

I suspect that you or I could break a reasonably long monoalphabetic substitution cipher with four or five nulls, using a pencil and paper, without a great deal of difficulty. At least some letter pairs would still be recognisable, and frequency analysis would get you a fair way. Actually, I think there was one in Simon Singh’s book.
A cipher using lots of nulls – imagine that upper case letters were significant and lower case ones nulls, used more or less equally – now, that would be a very different matter. But that would work by randomizing the text, and whatever the Voynich text is, it’s certainly not random.
Diane on December 3, 2013 at 10:43 am said:

Job, Nick –

Has there ever been an example of characters used in alternate orientation? I mean: every second time “u” appeared, it would look like “n”. Or: in an odd-numbered position to sound ‘u’ but if in an even one, ‘n’.

Would it be easily broken?
nickpelling on December 3, 2013 at 1:37 pm said:

Diane: that’s still just a homophonic cipher (i.e. multiple different glyphs used to encipher single plaintext letters), albeit one with a secondary set of arbitrary placement rules, and hence should yield to normal cipher attacks.
Diane on December 3, 2013 at 2:03 pm said:

Thax
🙂
nickpelling on December 3, 2013 at 2:29 pm said:

Job: in my opinion, Voynichese’s compact alphabet and heavily structured internal syntax / morphology provide no obvious reason for us to suspect nulls in it – by way of comparison, the Blitz Cipher’s alphabet has ~50 glyphs, which is probably a sign that something extra is going on, though quite what that something is remains to be determined. 🙂
nickpelling on December 3, 2013 at 2:37 pm said:

SirHubert: all good points and agreed, but the point I was making was simply that I think it reasonably likely that we have a combination of techniques going on here – perhaps some nulls, some homophones and some nomenclature, but all at the same time. If we could run the text through some kind of analytical machinery and say ‘these letters could well be vowels’, ‘these letters could well be nulls’, etc (and perhaps even why this is the case), then it would be an extraordinarily good starting point for trying to break the ciphers in a more systematic way.

For all the talk of science in the Copiale Cipher story, most of the progress they made came from making workable guesses over a long period of time rather than being properly “scientific” per se – I was just hoping to use a little bit of actual cryptanalytical science to get started with this on some kind of sound footing, so I was genuinely surprised when I couldn’t find any.

Of course, Brigadier Tiltman would probably have broken this in his sleep. 🙂
Job on December 3, 2013 at 9:35 pm said:

As far as the Voynich goes, there isn’t much room for null characters, but null words are plausible.

On the other hand, a null-word scheme could be as complex as we want it to be. It could lead to shoehorning, e.g.:

1. A word is null if it starts or ends with a specific character.
2. A word is null depending on position.
3. A word is null depending on a combination of characters.
4. A word is null depending on structure.
5. A word is null if I want it to be.

It’s much less of a triumph: “if you were a real cryptanalyst you would decode the whole thing”.
bdid1dr on December 4, 2013 at 4:48 pm said:

Gregg Shorthand (which would have been in use during Currier and Tiltman’s time) had several characters which represented a combination of syllables such as “sim-lar”. Since I am a “dmy” when it comes to cryptology, can you explain the purpose and use of “null”s in ancient manuscripts?

If, today, I were to write in Gregg Shorthand Mary d’Imperio’s last name, it would appear as three Gregg characters. Oh, I hated taking dictation because it required me to read the speaker’s lips while scribbling in Gregg — right off the edges of the paper I was writing on!
xplor on December 4, 2013 at 7:23 pm said:

Warren Weaver would have had a good night’s sleep and found the translation on his desk in the morning in English. Has anyone checked for linguistic universals in the Voynich?
nickpelling on December 4, 2013 at 7:31 pm said:

xplor: many, many times, and there isn’t even any kind of consensus on where the vowels are, let alone anything else. So then they go all abjaddy, proto-thisly or proto-thatly, or (spare my soul for even typing the word) polyglot linguistic-y. What a waste of time. 🙁
xplor on December 4, 2013 at 10:26 pm said:

Is it possible the programs that could solve these concerns are still classified? The Information Sciences Institute is looking for someone to work on the Voynich.
thomas spande on December 4, 2013 at 11:32 pm said:

Nick et al., Armenians commonly doubled a glyph to encode for a single glyph. So a rough and ready test of the VM cipher is to look for triplets. I think with 3 “c”s in a row, the last is a null. As is “ccc” I think cc is Latin “e” and third “c” is a null: likewise in “c-c c” (means “i”) or “c-c*c” where c-c* indicates c-c with a curlicue above the line and I think is a Latin “u”. So the c-c combos are all vowels and the gallows are all consonants. “e”, it seems to me, is not a null, but rather has two forms (“8” and “cc”). The Latin “i” appears also along with “c-c” and is not redundant but two glyphs with the same meaning. The strange glyph that resembles an “8” with a rocker at the base is, I think, a “full stop” for a sentence and amounts also to a null. No algorithms, just inspection. Undoubtedly, this will prove too simplistic but so far some decrypts seem to work. Cheers, Tom
bdid1dr on December 16, 2013 at 6:32 pm said:

Nick, do you remember our Blitz Cipher correspondence of a year or so ago where basically we ended up with the “blueprint” of the layout of the tile work for Colonel North’s Turkish bath in his mansion? At that time, I recall, you would be following up with a visit to Greenwich University’s archives (which apparently were stored in a facility on the Isle-of-Dogs)?
nickpelling on December 16, 2013 at 6:50 pm said:

bdid1dr: that it was somehow blueprints to Colonel North’s amazing (but long-destroyed) Turkish baths was your conclusion at that time, but it was alas not mine, not by a long way. But each to their own!
bdid1dr on December 17, 2013 at 12:25 am said:

Well, considering that you received that puzzle from an employee of Greenwich University (incognito) who wished for his identity not to be disclosed — I guess my “educated guess” as to that page (which was retrieved from the Teaching College for Women, and which suffered damage from WW II bombing “Blitz”, and eventually became the “North Campus” of Greenwich University — all documented by the current administration) will have to remain a guess rather than provide a “cipher” solution. Would you be able to still contact the person who presented that puzzle to you? Perhaps Diane can recall some of our earlier discussions?
nickpelling on December 17, 2013 at 7:31 am said:

bdid1dr: there was no link with Greenwich University, as I recall.
bdid1dr on December 17, 2013 at 5:50 pm said:

Thanx, Nick, I’ll be trolling G. U.’s “North Campus” blog about their Library annex which incorporated some elements of Colonel North’s “greenhouse/turkish bath” structure. Perhaps they’ve expanded on their earlier discussions (I hope).
bdid1dr on December 19, 2013 at 12:21 am said:

And maybe we can even find some manuscripts in their collection?
😉
bdid1dr on December 19, 2013 at 5:53 pm said:

It was the City of London which sold the “North” property to Greenwich University, after the women’s school had been derelict for quite a while (litigation?). I wasn’t able to determine when the “Blitz” damage occurred (before or after the sale of the property to Greenwich).
That mystery diagram was instructions for tiling the Turkish bath which was separated from the mansion by a glass greenhouse which was heated by hot water/condensed steam pipes. There is a “post-card” photograph of the tiled bath on another blog. Quite eerie.
bdid1dr on December 19, 2013 at 9:33 pm said:

ThomS, in re the “loop” above the paired “C’s” : The loop represented some missing alpha characters between those paired ‘c’ s. Example: B-408-f-35-r: line 4: “a-qu-o-ell-ce-geus”
“c-ro-ce-ell-e-ce-geus” — two Latin words which can be sounded out, and which definitions can be found in a Latin-English dictionary — independent of ANY illustration/drawing. Of course, the illustration was quite “illustrative” in depicting the stigmas of the saffron crocus as being the most important part of that plant.
I’m not too sure that Colonel North ever tried to grow saffron crocuses in his elaborate greenhouse/turkish bath, but I do know that Britishers loved all aspects of gardens. The saffron crocus is a beautiful flower. I now have some 300 saffron crocuses emerging on east, south, and west sides of my mountain home.
bdid1dr on December 21, 2013 at 5:52 pm said:

Some really confused dialogue on this post. Am I confused about the subject being the Blitz Cipher — & the use of “null-detection-algorithms” on that one-page document which apparently survived the “Blitz” and ended up in the archives of Greenwich University?
I’m doing my best to stick with whatever puzzle presentation placed post-wise.
Perseveringly yours,
beady-eyed wonder
nickpelling on December 21, 2013 at 8:30 pm said:

bdid1dr: I’m sorry, but it’s you who have got confused here – there has never been any evidence linking the Blitz Ciphers with Greenwich University.
bdid1dr on December 21, 2013 at 10:23 pm said:

Thanx Nick, I shall not pursue this subject matter any further. Have some 1-dr-ful holidays!
bd
bdid1dr on December 23, 2013 at 7:52 pm said:

BTW, Diane: Thnx 4 ur “thax” sign-off on your earlier post on this page. 2 yr gd hlth ! 🙂
nickpelling on November 14, 2016 at 1:22 pm said:

bdid1dr: in London, you can find new housing development on just about any green space big enough for a puppy to cock a rear leg at.