According to Stu Rutter’s latest raid on the National Archives, the Allies’ WW2 “Typex” cipher used a single five-letter indicator, placed both at the start and end of messages [so says WO 208/5109, anyway]: and so he concludes that Typex was very probably the system used by our wonderfully mysterious WW2 pigeon cipher. Having said that, I do wonder whether the first five plaintext letters will turn out to be “QQQQQ”, as I recall that many messages had this dummy text group added at the start to avoid stereotyped messages, even though it was itself an even more stereotypical sequence. Perhaps combining this with the ciphertext might let us work back to the rotor contents and settings… just a thought!
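For anyone who wants to check the "same indicator at both ends" pattern mechanically, here is a minimal Python sketch. The helper name `find_indicator` is hypothetical, and the sample groups are made-up placeholders, not the pigeon message itself.

```python
from typing import Optional


def find_indicator(message: str, group_len: int = 5) -> Optional[str]:
    """Return the indicator if the first and last groups match, else None."""
    groups = message.upper().split()
    if len(groups) < 2:
        return None
    first, last = groups[0], groups[-1]
    # An indicator (as described above) is one five-letter group used twice.
    if len(first) == group_len and first.isalpha() and first == last:
        return first
    return None


# Made-up placeholder groups, purely to show the shape of the check.
sample = "ABCDE FGHIJ KLMNO PQRST ABCDE"
print(find_indicator(sample))
```

A real message would also need any trailing non-cipher groups (dates, counts and so on) stripped first, which this sketch does not attempt.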

[Typex remains a great working hypothesis, though personally I’d still like to see how the Air Support Syllabic Cipher (War Office document BX 724) and Royal Engineer Syllabic Cipher (War Office document BX 724/RE) worked. But that’s another story!]

Intriguingly, GCHQ’s archives hold a Typex document which Stu would understandably like to get access to: and at last weekend’s Big Bang Science Fair at London’s ExCel venue (which my son thought was really fantastic), I was very pleasantly surprised to bump into the GCHQ Historian hard at work on the GCHQ stand, busily helping children encipher their own Enigma messages for Bletchley Park’s rebuilt Bombe to try to crack. He told me that GCHQ releases documents more according to security-related criteria than in response to Freedom of Information requests: and even though he would send us through the appropriate paperwork to fill out, we should necessarily be somewhat patient… it’s no secret that it’s not the fastest of processes (for example, they released the last Enigma file only last year). Fingers crossed that it all goes through!

Incidentally, the Americans didn’t think Typex was properly fit for purpose, sniffily describing it as “nothing but a glorified German Enigma, with 5 rotors instead of 3 and with arrangements for printing” (NARA: RG 457 HCC Box 804 NR 2323, quoted in Ratcliff “Delusions of Intelligence”, p.167), while British cryptologists also saw flaws in Typex “as early as 1940”, though their recommendations as to how to work around them seem to have been acted upon (“Delusions”, p.179). Yet even though the Germans knew exactly how Typex worked, they had “abandoned work on it” prior to 1942, presumably because of its structural similarity to their own ‘unbreakable’ Enigma variants (“Delusions”, p.178 and p.202).

But here’s something to do with Typex that’s rather interesting (and more social history than overtly cryptographic) which I liked, and think you may well like too. 🙂

Having posted a few days ago on the British Army’s pervasive use of ciphers for pigeon messages, I was intrigued to read about the Army “cipher room” at Arundel Castle mentioned by Bill Button: and so decided to snoop around the web for further mentions of WW2 cipher rooms. The nicest things I dug up by far were three reminiscences made by Jessie Dunlop in 2004 (courtesy of her daughter Ann Wild) on the BBC’s “People’s War” website. Rather delightfully, these described her wartime cipher experiences, firstly at Low Grade Cipher School in Eccleston Square, secondly at High Grade Cipher School in Half Moon Lane, London, and then finally at SHAEF Supreme headquarters in an Underground tunnel between Goodge Street and Warren Street Station, at which time she met her future husband Jack.

Confusingly, she misremembered Typex as “xyco” (which is why these posts didn’t show up in web searches), but that’s entirely to be expected – it was a very long time ago, after all. In a follow-up comment from 2004, she further described how xyco / Typex was used:-

“I think it was modelled on the Enigma. It had several drums in the top with a lid to be lifted to reach these. The first one was static and was set each day with the beginning of the day’s code. The rest were also set each day but they revolved. A keyboard like a typewriter was below these and on this the message was typed in. It came out in groups of letters, I think. Sometimes we could add what was called a scrambler, an electrical gadget which we plugged in if the message was top secret. This was indicated at the end of the message in the code.”

And so it would seem that in the pigeon cipher, we’re looking at
* an enciphered Army message (quite possibly in Typex);
* not sent during Operation Overlord (i.e. not on D-Day or shortly after); and
* not top secret (and hence not using any kind of scrambler).

This is really useful, because it probably means that Stu Rutter need not worry about scramblers or reflectors (I think): for if we are looking at a non-top-secret Typex message, it probably wasn’t using a scrambler. So as long as he has an accurate copy of the contents and structure of the rotors and the way the Typex worked, who’s to say that Stu’s JavaScript simulator won’t be able to give us the answer? If so, it might arguably be the first Typex message ever decrypted by anyone… and how cool would that be? 😉
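To make the moving parts a little more concrete, here is a toy Python sketch of the general kind of machine Jessie describes: one static drum, several revolving ones, and a typewriter-style input. Everything specific in it is an assumption for illustration only: the wirings are arbitrary seeded shuffles (not real Typex rotors), the odometer-style stepping is a guess, and the Enigma-style reflector is included simply because she thought the machine was "modelled on the Enigma". It is not Stu's simulator and makes no claim to match a real Typex.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def random_wiring(seed: int) -> list:
    """An arbitrary rotor wiring (a permutation of 0..25); NOT real Typex wiring."""
    rng = random.Random(seed)
    wiring = list(range(26))
    rng.shuffle(wiring)
    return wiring


def random_reflector(seed: int) -> list:
    """An arbitrary reflector: 13 letter pairs, each swapped with its partner."""
    rng = random.Random(seed)
    letters = list(range(26))
    rng.shuffle(letters)
    reflector = [0] * 26
    for a, b in zip(letters[0::2], letters[1::2]):
        reflector[a], reflector[b] = b, a
    return reflector


class ToyRotorMachine:
    def __init__(self, static, moving, reflector, positions):
        self.static = static              # set once, never steps
        self.moving = moving              # wirings of the revolving drums
        self.reflector = reflector
        self.positions = list(positions)  # one offset per revolving drum

    def _step(self):
        # Assumed odometer-style stepping: the first drum always moves,
        # each later drum moves when the one before it wraps round.
        for i in range(len(self.positions)):
            self.positions[i] = (self.positions[i] + 1) % 26
            if self.positions[i] != 0:
                break

    @staticmethod
    def _forward(wiring, offset, c):
        return (wiring[(c + offset) % 26] - offset) % 26

    @staticmethod
    def _backward(wiring, offset, c):
        return (wiring.index((c + offset) % 26) - offset) % 26

    def encipher(self, text: str) -> str:
        out = []
        for ch in text.upper():
            if ch not in ALPHABET:
                continue  # skip spaces and punctuation
            self._step()
            c = ALPHABET.index(ch)
            c = self._forward(self.static, 0, c)
            for wiring, pos in zip(self.moving, self.positions):
                c = self._forward(wiring, pos, c)
            c = self.reflector[c]
            for wiring, pos in reversed(list(zip(self.moving, self.positions))):
                c = self._backward(wiring, pos, c)
            c = self._backward(self.static, 0, c)
            out.append(ALPHABET[c])
        return "".join(out)


def build(positions):
    """Build a toy machine with fixed (made-up) wirings and the given start positions."""
    return ToyRotorMachine(random_wiring(1),
                           [random_wiring(s) for s in (2, 3, 4)],
                           random_reflector(5),
                           positions)


ciphertext = build([0, 0, 0]).encipher("PIGEON")
print(ciphertext)
print(build([0, 0, 0]).encipher(ciphertext))  # same settings undo the encipherment
```

The useful property to notice is the last line: with a reflector in the path, enciphering and deciphering are the same operation, so a simulator with the right rotors and settings would turn ciphertext straight back into plaintext.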

Incidentally, one great sanity check might be to ask the Royal Signals Museum in Blandford, Dorset if they could use their Typex machine to encipher some test messages with various rotor settings to validate Stu’s simulator. In return, perhaps they might like to have his simulator on display next to their machine, so that visitors can try it out for (virtually) real? That would be good for everyone, I think. 🙂

Anyway, here’s a question for you all: how can we find out if Jessie Dunlop – or indeed anyone else who worked on British Army High Grade Ciphers, whether in SHAEF or elsewhere – is still alive? Perhaps having her look at our pigeon message might trigger some memories of how it all worked. Something to think about! 🙂

When “X. Lamb” unexpectedly announced that the “Tamam Shud” Unknown Man was a certain “H. C. Reynolds” (whose merchant seaman’s ID card she had), I’m sure that she was utterly convinced of the truth of what she was claiming, and that she believed it was simply a matter of time before evidence properly supporting it would emerge.

Indeed at first sight, it seemed both to me and others as though it ought to be fairly straightforward to test her claim. After all, we had a very specific data point to work with (admittedly surrounded by a whole load of media and online speculation, most of it unhelpful and distracting) – a name, a face, a date and a place of birth (Hobart, Tasmania).

Eventually, thanks mainly to solid work from Cheryl Bearden, we determined that this “H. C. Reynolds” was born in February 1900 and had the middle name Charles (which he clearly preferred to his as-yet-unknown first name “H[—–]”): and we were able to reconstruct his brief career as a merchant seaman working for the Union Steam Ship Company, the “Southern Octopus”. It was clear that this Reynolds was no fantasy, but a real flesh-and-blood person: and so, in theory, all we had to do was dig up a link between his maritime career and his life on land, and bingo – all his life would be spread before us.

Pursuing this fairly slender reed of a lead yet further, I managed to discover (from his employee records) his exact date of birth (8th February 1900): and, from the ever-useful “Log of Logs”, that ships’ logs for two of the three ships Reynolds worked on could be found in two different Australian archives. Very kindly, both Diane O’Donovan and John Kozak took the time to go and look at these two log books (one each), and found… nothing. Nada. Zero. And that, I strongly suspected at the time, was going to prove the end of the whole affair: for whatever reason, this H. Charles Reynolds seemed doggedly determined to stay just out of our archival reach. It felt hard not to conclude that we’d never be able to convincingly prove or refute X. Lamb’s assertion that he was the Unknown Man found mysteriously dead on Somerton Beach on 1st December 1948.

Frustratingly, it had been reported early on that a similarly-named-but-apparently-quite-different H. C. Reynolds (a “Horace Charles Reynolds”) had been born in Triabunna in Tasmania on 12th February 1900. But once we knew that “our” H. C. Reynolds had been born on a different day in the same month fifty or so miles away in Hobart, this was a fact that became pigeonholed in the ‘curiously coincidental but annoyingly unhelpful‘ category. And anyway, it was also reported that this particular Horace Charles Reynolds had been a poultry farmer, and that (when asked) his family didn’t believe that he had ever gone to Adelaide, let alone gone to sea. Oh well. 🙁

Step forward Debra Fasano: though a little late to the whole H. C. Reynolds party, she carved a path through the fuzz of uncertainty straight to an extremely reliable source – the “Tas BDM” (Tasmanian Births, Deaths, and Marriages) indexes on CD. And the entry she found there turned the whole story round:-

Tasmanian Federation Index 1900-1930 (CD)
Author: Macbeth Genealogical Services
Year: 2006
ISBN: 1920757082

Surname: REYNOLDS
Given name: Horace Charles
Event: Birth
Father: Edwin REYNOLDS
Mother: Mary Ann Matilda BAYLEY
Date: 8 Feb 1900
Sex: Male
Place: Davey Street, Hobart
Registration Number: 200

And with that, all the pieces finally start to fall into place. There weren’t two Horace Charles Reynolds-es born in or near Hobart in February 1900: there was, without much doubt, just the one. Debra adds:

“When looking for Horace’s birth I had a good search of the indexes and couldn’t find anyone else with a similar name, initials, or anything else that might be relevant. I am really strict about evidence and I do think that he is the person who was working as a Purser.”

As to when this Horace Charles Reynolds died, there’s a death notice in the Hobart Mercury (18 May 1953), which seems very probably the same man:-

REYNOLDS. -Suddenly, on May 16, 1953, at a private hospital, Hobart, Horace Charles Reynolds, late of Brookvale, New South Wales, aged 53 years. Private cremation.

We knew that our H. C. Reynolds was born in Hobart and got his first job in Hobart: and from this notice, it seems almost undeniable that Hobart was where he died too.

I say “almost”, because there are a few matters that remain half-open, not least of which is the matter of Reynolds’ family apparently denying that the photo was of him. I wonder, though: had someone seen a quite different Horace Reynolds from Wooroloo who died in 1954 (as per The West Australian Monday 15 March 1954 p 30) and put the two stories together? That particular Horace Reynolds was born 10th April 1903, was NX69883 in the 2nd AIF, and was a farmer married to Elizabeth. My guess is that he will turn out to be the “poultry farmer” mentioned very early on, someone quite different to the one we were actually looking for.

The Tasmanian Horace Charles Reynolds appears to have had no children: but even if he didn’t marry, it’s entirely possible that we could trace his immediate family right to the present day and perhaps ask them if we could find a photo of him – after all, 1953 wasn’t really so very long ago, was it?

Debra Fasano notes that Reynolds had two older brothers:
* Oswald Bayley Reynolds (b. ~1891) was a billing clerk who rose to become a senior bank administrator.
* Archibald Henry Reynolds (b. 1895) was (according to the 1930 and 1933 NSW electoral rolls) a clerk living in Carter Road at Brookvale in NSW.

I also noticed in Trove that Mrs Edwin Reynolds stepped down as Treasurer of her local Triabunna town committee in 1898, so it should perhaps come as no great surprise that Horace Charles Reynolds started out as an Assistant Purser, for he came from a veritable family of clerks. (Or do I mean “a fastidity of clerks”? I never can remember collective nouns).

Finally, Debra notes that a “Charles Reynolds” was also living in Carter Road in the 1930s, and working as (you guessed it) a clerk. Given that we know that our H. Charles Reynolds was already signing himself “Charles Reynolds” by 1919, and that the Horace Charles Reynolds who died in 1953 had been living in Brookvale, what are the odds that these are all pieces of the same cussedly consistent jigsaw? If there is a chink somewhere in this logical chain-mail armour, I have to say that I can’t currently see it.

Anyway, I’ve already been told off once this week for a ‘TL;DR’ (“Too Long; Didn’t Read”) post, so I’d better bring this to a close here. Perhaps someone will be able to use these details to ferret out a living relative of the various Reynolds brothers, and perhaps try to dig up a separate photograph of Horace Charles Reynolds to independently test this whole narrative. It would be nice to get proper closure on this, even if it isn’t quite the result some may have hoped for.

By the way, if you do decide to try to trace this all the way to the end, Debra suggests a number of surnames connected with the Reynolds family that may be of assistance:-

LESTER
VALENTINE
SHEA
SPENCER
FLETCHER
DENNE
ROLSTON
PAGE
TATE
MULLANE
LEVY
ALOMES
HARDY OR HARDING

Happy hunting! 😉


The story of how Ricky McCormick was found dead with two (apparently enciphered) notes in his pocket hit the news a while back, but I hesitated to write it up as a cipher mystery at the time because I didn’t think the media coverage was even remotely reliable. But revisiting the whole affair recently, I found a simply splendid online article courtesy of the Riverfront Times called “Code Dead” (by Christopher Tritto), which turned my opinion of the whole case right round.

This revealed…
* that McCormick had just travelled back from Florida, from where he had allegedly brought back baseball-sized zip-lock bags of marijuana for Baha Hamdallah, brother of the owner of the gas station where McCormick worked.
* that he was closely associated with some violent (if not actually sociopathic) individuals, such as Gregory Knox
* that the stretch of road his body was found on was used for dumping dead bodies both before and after his death
* that the FBI’s Cryptanalysis and Racketeering Records Unit (CRRU) sat on the two mystifying notes for 12 years before announcing their existence
* that McCormick’s family knew nothing about the notes until they heard them mentioned on the news. (“Now, twelve years later, they come back with this chicken-scratch shit.”)

Moreover…
* McCormick fathered two children with a girl he called “Pretty Baby” before she was 14 (for which he went to prison)
* he experienced chest pains and shortness of breath the week before he died, severe enough for him to check into ER. (Though admittedly he had smoked “at least a pack of cigarettes a day” since he was ten, and typically drank “more than twenty caffeinated beverages a day”).
* McCormick could hardly read or write when he left school. (“The only thing he could write was his name”, and Ricky “couldn’t spell anything, just scribble.”)

Coincidentally, everyone’s favourite crypto-gal Elonka Dunin lives close to where McCormick’s body was left, and she’s taken an interest in the cipher mystery aspect of the case, even doing a video interview for the Riverfront Times explaining how monoalphabetic substitution ciphers work (not that that’s what we’re looking at here, *sigh*). But having learnt more about McCormick’s background and situation, she concludes “I don’t think McCormick wrote these notes”, and that “[P]erhaps he was a courier.”

(If you haven’t seen the notes before, the two thumbnails below link to decent quality scans of them – well worth opening up in a browser to see what all the fuss is about.)

[Image: thumbnail scan of note 1]

[Image: thumbnail scan of note 2]

So, what *are* we looking at here? Well, the Internet (as always) has plenty of commentary to wade through. The CRRU’s Dan Olson points out that “There are many E’s… that could be used as a spacer”: while Elonka notes the plethora of patterns periodically peppering the pages (such as “WLD”, “NCBE”, “SE” etc). There are also lots of bracket pairs (which have somehow led to the suggestion that it may in part be lists) as well as punctuation marks, most notably an apostrophe, which would loosely imply that the word preceding it (“WLD”) may well be a noun.

Olson seems convinced that the writer of the notes was ingenious and calculating, while Elonka too appears to think that they are of a complexity that would have been beyond McCormick’s abilities. Respectfully, I have to disagree: for I suspect that the main key to the notes’ impenetrability lies not in paranoia or secrecy but in a probable explanation for why McCormick failed school (and, conversely, why school failed McCormick) – dyslexia.

Look again at three highly structured consecutive lines from the notes:
[Image: three consecutive lines from the notes]

To me, this looks a lot like a mixed-up version of:-
* FIRSE PERSON D 71 NCB[E]
* SECND PERSON’S D 74 NCB[E]
* TRD’S PERSON R D 75 NCB[E]

Specifically, I think “NCB” will turn out to be a local address in St Louis (maybe even initials for Clinton Peabody?) – and if that’s right, why would the numbers not be the flat / house numbers of people buying drugs? McCormick preferred moving round at night (like “a vampire”), and he carried and held big bags of marijuana from Orlando for Baha Hamdallah (according to McCormick’s girlfriend), so the suggestion that he might have been some kind of small-time drug runner or dealer probably isn’t totally wild.

I don’t know, though: it’s all just awful. Victorian-era historians saw their job as weaving narratives around Events In History for the moral edification and correct instruction of Society In General, and even many moderns would find it journalistically tempting to take McCormick’s life of denial and ignominious death as launching pads for some glib commentary on a whole set of social macro-epidemics – guns, drugs, poverty, social inequality, education, dyslexia, whatevuh.

But all I’m actually left with is a feeling of deep sadness – that what we’re glimpsing into in these two notes is the life of a poor, illiterate guy who aspired to ride the horse of opportunity, but only ever got dragged behind it.

So, what strikes me most powerfully is that quite unlike other cipher mysteries, I don’t actually want to read what was written on McCormick’s two notes. I understand people often feel a deep-seated need for closure, but does any kind of (capital-j) Justice have the power to right the wrongs of these slow-motion train-wrecks?


Now here’s an interesting thing. I’ve just read “From El Alamein to the Alps with Pigeons” by Bill Button (who used to write a pigeon column for The Racing Pigeon Weekly under the name “Uno Solo”), which relates – you’ll be unsurprised to hear – his WW2 experiences running pigeon lofts in North Africa and Italy.

I rather enjoyed it, because it brought across a lot of the feeling Sigm (signalmen) had for their pigeons. If a pigeon arrived back injured, they did their best to sort it out and patch it up with whatever they had to hand: and we tend to forget that the Axis aside, war pigeons perpetually had to deal with the threat of their Other Enemy… hawks, hungry for a slice of pigeon pie (though without the pie). No wonder they often flew faster than a mile every minute. 😉

However, for our purposes, page 1 tells us something simple and straightforward that changes our basic perspective on the problem we face. Early in the war, Button had been drafted to a civilian loft in Hurstpierpoint (in West Sussex) owned by a Mr Greer, which supplied 6-12 birds to the Armed Forces (normally the Army) every 2-3 days:-

When the birds returned to the ‘home’ loft we had to deliver the messages to what I believe was Arundel Castle. Although we saw many of the messages, they meant nothing to us, having been written in cipher. On arrival at the Castle, we had to report to the cipher room and hand over the messages. Entry was forbidden.

And so there you have it. Contrary to what you might think, the British Army sent pigeon messages in cipher throughout the war. Hence the whole romantic notion that what we are looking at could only have been sent in high desperation from France on D-Day rather evaporates… it could have been sent pretty much any time from late 1940 onwards, and for one or more of a whole panoply of reasons.

In fact, because [as Mike Moor helpfully pointed out (and more on that another time)] pigeon pads sent out to the British Army by Wing House for D-Day were overstamped “OPERATIONAL MESSAGE – Telephone to War Office Signal Office, WHITEHALL 9400“, there is a strong case to be made that D-Day is in fact the one day this message could not have been sent.

Thus does History iterate slowly towards a better picture of what actually happened. 🙂

USC’s irrepressible Kevin Knight and Dartmouth College Neukom Fellow Sravana Reddy will be giving a talk at Stanford on 13th March 2013 entitled “What We Know About the Voynich Manuscript“. Errm… which does sound uncannily like the (2010/2011) paper by the same two people called, errrm, let me see now, ah yes, “What We Know About the Voynich Manuscript“.

Obviously, it’s a title they like. 🙂

As I said to Klaus Schmeh at the Voynich pub meet (more on that another time), what really annoys me when statisticians apply their box of analytical tricks to the Voynich is that they almost always assume that whatever transcription they have to hand will be good enough. However, I strongly believe that the biggest problem we face precedes cryptanalysis – in short, we can’t yet parse what we’re seeing well enough to run genuinely useful statistical tests. That is, not only am I doubtful of the transcriptions themselves, I’m also very doubtful about how people sequentially step through them, assuming that the order they see in the transcription of the ciphertext is precisely the same order used in the plaintext.

So, it’s not even as if I’m particularly critical of the fact that Knight and Reddy are relying on an unbelievably outdated and clunky transcription (which they certainly were in 2010/2011), because my point would still stand regardless of whichever transcription they were using.

In fact, I’d say that the single biggest wall of naivety I run into when trying to discuss Voynichese with people who really should know better, is that hardly anyone grasps that the presence of steganography in the cipher system mix would throw a spanner (if not a whole box of spanners) in pretty much any neatly-constructed analytical machinery. Mis-parsing the text, whether in the transcription (of the shapes) and/or in the serialization (of the order of the instances), is a mistake you may well not be able to subsequently undo, however smart you are. You’re kind of folding outer noise into the inner signal, irrevocably mixing the covertext into the ciphertext.

Doubtless plenty of clever people are reading this and thinking that they’re far too smart to fall into such a simple trap, and that the devious stats genies they’ve relied on their whole professional lives will be able to fix up any such problem. Well, perhaps if I listed a whole load of places where I’m pretty sure I can see this happening, you’ll see the extent of the challenge you face when trying to parse Voynichese. Here goes…

(1) Space transposition cipher

Knight and Reddy are far from the first people to try to analyze Voynichese word lengths. However, this assumes that all spaces are genuine – that we’re looking at what modern cryptogram solvers call an “aristocrat” cipher (i.e. with genuine word divisions) rather than a “patristocrat” (with no useful word divisions). But what if some spaces are genuine and some are not? I’ve presented a fair amount of evidence in the past that at least some Voynichese spaces are fake, and so I doubt the universal validity and usefulness of just about every aggregate word-size statistical test performed to date.

Moreover, even if most of them are genuine, how wide does a ciphertext space have to be to constitute a plaintext space? And how should you parse multiple-i blocks or multiple-e blocks, vis-a-vis word lengths? It’s a really contentious area; and so ‘just assuming’ that the transcription you have to hand will be good enough for your purposes is actually far too hopeful. Really, you need to be rather more skeptical about what you’re dealing with if you are to end up with valid results.
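To show how much the answer depends on that one parsing decision, here is a small Python sketch. It assumes a transcription in which `.` marks a definite space and `,` an uncertain one (treat that convention, the function name `word_lengths` and the sample line as assumptions for illustration; the sample is made up, not a real folio line).

```python
from collections import Counter


def word_lengths(line: str, uncertain_is_space: bool) -> list:
    """Word lengths for one transcribed line under one reading of ',' marks."""
    if uncertain_is_space:
        line = line.replace(",", ".")   # treat every doubtful gap as a real space
    else:
        line = line.replace(",", "")    # treat every doubtful gap as no space at all
    return [len(word) for word in line.split(".") if word]


sample = "qokedy.qo,kedy.dal.ch,ol"     # made-up line with two uncertain spaces

as_spaces = word_lengths(sample, uncertain_is_space=True)
as_joined = word_lengths(sample, uncertain_is_space=False)

print(Counter(as_spaces))   # histogram if doubtful gaps count as spaces
print(Counter(as_joined))   # histogram if they do not
```

Even on five or six words the two histograms differ, so any aggregate word-length statistic is quietly inheriting whichever choice the transcriber (or the analyst) happened to make.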

(2) Deceptive first letters / vertical Neal keys

At the Voynich pub meet, Philip Neal announced an extremely neat result that I hadn’t previously noticed or heard of: that Voynichese words where the second letter is EVA ‘y’ (i.e. ‘9’) predominantly appear as the first word of a line. EVA ‘y’ occurs very often word-final, reasonably often word-initial (most notably in labels), but only rarely in the middle of a word, which makes this a troublesome result to account for in terms of straightforward ciphers.

And yet it sits extremely comfortably with the idea that the first letter of a line may be serving some other purpose – perhaps a null character, or (as both Philip and I have speculated, though admittedly he remains far less convinced than I am) a ‘vertical key’, i.e. a set of letters transposed from elsewhere in the line, paragraph or page, and moved there to remove “tells” from inside the main flow of the text.
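Philip's observation is easy to re-test against whichever transcription you trust. The sketch below is only that, a sketch: it assumes `.`-separated EVA words, one string per manuscript line, and the three sample lines are invented purely to exercise the counting (they are not real Voynichese lines).

```python
def second_letter_y_counts(lines):
    """Count words whose second letter is EVA 'y', split by line position."""
    counts = {"first_hits": 0, "first_total": 0, "other_hits": 0, "other_total": 0}
    for line in lines:
        words = [w for w in line.split(".") if w]
        for index, word in enumerate(words):
            where = "first" if index == 0 else "other"
            counts[where + "_total"] += 1
            if len(word) >= 2 and word[1] == "y":
                counts[where + "_hits"] += 1
    return counts


# Invented lines, just to show the shape of the input and output.
lines = ["dydal.chol.daiin", "oykedy.qokedy.dal", "chol.syar.daiin"]
print(second_letter_y_counts(lines))
```

Comparing `first_hits / first_total` with `other_hits / other_total` over a whole section would show how strongly the effect holds, and whether it survives a change of transcription.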

(3) Horizontal Neal keys

Another very hard-to-explain observation that Philip Neal made some years ago is that many paragraphs contain a pair of matching gallows (typically single-leg gallows) about 2/3rds of the way across their topmost line: and that the Voynichese text between the pair often presents unusual patterns / characteristics. In fact, I’d suggest that “long” (stretched-out) single-leg gallows or “split” (extended) double-leg gallows could well be “cipher fossils”, other ways to delimit blocks of characters that were tried out in an early stage of the enciphering process, before the encipherer settled on the (far less visually obvious) trick of using pairs of single-leg gallows instead.

Incidentally, my strong suspicion remains that both horizontal and vertical Neal keys are the first “bundling-up” half of an on-page transposition cipher mechanism, and that the other “unbundling” half is formed by the double-leg gallows (EVA ‘t’ and ‘k’). That is to say, that tell-tale letters get moved from the text into horizontal and vertical key sequences, and replaced by EVA ‘t’ (probably horizontal key) or EVA ‘k’ (probably vertical key). I don’t claim to understand it 100%, but that would seem to be a pretty good stab at explaining at least some of the systematic oddness (such as “qokedy qokedy dal qokedy qokedy” etc) we do see.

Regardless of whether or not my hunch about this is right, transposition ciphers of precisely this kind of trickiness were loosely described by Alberti in his 1465 book (as part of his overall “literature review”), and I would argue that these ‘key’ sequences so closely resemble some kind of non-obvious transposition that you ignore them at your peril. Particularly if you’re running stats tests.

(4) Numbers hidden in aiv / aiiv / aiiiv scribal flourishes

This is a neat bit of Herbal-A steganography I noted in my 2006 book, which would require better scans to test properly (one day, one day). But if I’m right (and the actual value encoded in an ai[i][i]v group is entirely held in the scribal flourish of the ‘v’ (EVA ‘n’) at the end), then all the real content has been discarded during the transcription, and no amount of statistical processing will ever get that back, sorry. 🙁

(5) Continuation punctuation at end of line

As I noted last year, the use of the double-hyphen as a continuation punctuation character at the end of a line predated Gutenberg, and in fact was in use in the 13th century in France and much earlier in Hebrew manuscripts. And so there would seem to be ample reason to at least suspect that the EVA ‘am’ group we see at line-ends may well encipher such a double-hyphen. Yet even so, people continue to feed these line-ending curios into their stats, as if they were just the same as any other character. Maybe they are, but… maybe they aren’t.

Incidentally, if you analyze the average length of words in both Voynichese and printed works relative to their position on the line, you’ll find (as Elmar Vogt did) that the first word in a line is often slightly longer than the others. There is a simple explanation for this in printed books: that short words can often be squeezed onto the end of the preceding line.
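A measurement of that kind can be sketched in a few lines of Python. This is not Elmar's code, just a minimal illustration with a hypothetical function name; `sep=None` splits on whitespace (for printed text), while `sep="."` would suit a dot-separated transcription.

```python
from collections import defaultdict


def mean_length_by_position(lines, sep=None):
    """Average word length for each word position (0 = first word on the line)."""
    totals = defaultdict(lambda: [0, 0])    # position -> [sum of lengths, count]
    for line in lines:
        words = [w for w in line.split(sep) if w]
        for position, word in enumerate(words):
            totals[position][0] += len(word)
            totals[position][1] += 1
    return {pos: total / count for pos, (total, count) in sorted(totals.items())}


# Two invented lines of "printed" text.
print(mean_length_by_position(["wonderful a cat", "extraordinary to be"]))
```

Whether line-final tokens such as EVA ‘am’ are counted as words or as punctuation will shift these averages, which is exactly the kind of decision that needs stating up front.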

(6) Shorthand tokens – abbreviation, truncation

Personally, I’ve long suspected that several Voynichese glyphs encipher the equivalent of scribal shorthand marks: in particular, that mid-word ‘8’ enciphers contraction (‘contractio’) and word-final ‘9’ enciphers truncation (‘truncatio’) [though ‘8’ and ‘9’ in other positions very likely have other meanings]. I think it’s extraordinarily hard to account for the way that mid-word ‘8’ and word-final ‘9’ work in terms of normal letters: and so I believe the presence of shorthand to be a very pragmatic hypothesis to help explain what’s going on with these glyphs.

But if I’m even slightly right, this would be an entirely different category of plaintext from that which researchers such as Knight and Reddy have focused upon most… hence many of their working assumptions (as evidenced by the discussion in the 2010/2011 paper) would be just wrong.

(7) Verbose cipher

I’ve also long believed that many pairs of Voynichese letters (al / ol / ar / or / ee / eee / ch, plus also o+gallows and y-gallows pairs) encipher a single plaintext letter. This is a cipher hack that recurs in many 15th century ciphers I’ve seen (and so is completely in accord with the radiocarbon dating), but which would throw a very large spanner both in vowel-consonant search algorithms and in Hidden Markov Models (HMMs), both of which almost always rely on a flat (and ‘stateful’) input text to produce meaningful results. If these kinds of assumptions fail to be true, the usefulness of many such clever analytical tools falls painfully close to zero.
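If you wanted to test the verbose cipher idea, the first step would be to re-tokenize the text before running any statistics on it. Here is a minimal greedy longest-match tokenizer in Python. The token list follows the groups named above, with the gallows taken to be EVA t / k / p / f; the greedy left-to-right strategy and the names `VERBOSE_TOKENS` and `tokenize` are my own assumptions, not an established method.

```python
# Candidate multi-letter groups: the pairs listed above, plus o+gallows and
# y+gallows pairs (gallows assumed here to be EVA t, k, p, f).
VERBOSE_TOKENS = ["eee", "ee", "al", "ol", "ar", "or", "ch"] + [
    lead + gallows for lead in "oy" for gallows in "tkpf"
]


def tokenize(word, tokens=VERBOSE_TOKENS):
    """Split an EVA word into composite tokens, longest match first."""
    ordered = sorted(tokens, key=len, reverse=True)
    out = []
    i = 0
    while i < len(word):
        for token in ordered:
            if word.startswith(token, i):
                out.append(token)
                i += len(token)
                break
        else:
            out.append(word[i])   # no group matches here: keep the single letter
            i += 1
    return out


print(tokenize("cholar"))
print(tokenize("daleee"))
print(tokenize("qokedy"))
```

Feeding the resulting token stream (rather than the raw letters) into a vowel-consonant search or an HMM is then a one-line change, and makes it possible to compare results under both parsings.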

(8) Word-initial ‘4o’

Since writing my book, I’ve become reasonably convinced that the common ‘4o’ [EVA ‘qo’] pair may well be nothing more complex than a steganographic way of writing ‘lo’ (i.e. ‘the’ in Italian), and then concealing its (often cryptologically tell-tale) presence by eliding it with the start of the following word. Hence ‘qokedy’ would actually be an elided version of “qo kedy”.

Moreover, I’m pretty sure that the shape “4o” was used as a shorthand sign for “quaestio” in 14th century Italian legal documents, before being appropriated by a fair few 15th century northern Italian ciphers (a category into which I happen to believe the Voynich falls). If even some of this is right, then we’re facing not just substitution ciphers, but also a mix of steganography and space transposition ciphers, all of which serves to make modern pure statistical analysis far less fruitful a toolbox than it would otherwise be for straightforward ciphers.
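Undoing the suggested elision is trivial to try out. The Python sketch below simply splits a word-initial ‘qo’ off as its own word; the function name is hypothetical, and whether the split is *right* is of course the whole question.

```python
def unelide_qo(text: str) -> str:
    """Split a word-initial EVA 'qo' from the rest of each word."""
    words = []
    for word in text.split():
        if word.startswith("qo") and len(word) > 2:
            words.extend(["qo", word[2:]])
        else:
            words.append(word)
    return " ".join(words)


print(unelide_qo("qokedy qokedy dal qokedy qokedy"))
```

Note that this changes both the word count and every word-length statistic for the line, which is why the choice matters for the tests described above.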

* * * * * * *

Personally, when I give talks, I always genuinely like to get interesting questions from the audience (rather than “hey dude, do you, like, think aliens wrote the Voynich?”, yet again, *sigh*). So if anyone reading this is going along to Knight & Reddy’s talk at Stanford and feels the urge to heckle, errm, ask interesting questions that get to the heart of what they’ve been doing, you might consider asking them things along the general lines of:

* what transcription they are using, and how reliable they think it is?
* whether they consider spaces to be consistently reliable, and/or if they worry about how to parse half-spaces?
* whether they’ve tested different hypotheses for irregularities with the first word on each line?
* whether they believe there is any evidence for or against the presence of transposition within a page or a paragraph?
* whether they have compared it not just with abjad and vowel-less texts, but also with Quattrocento scribally abbreviated texts?
* whether they have looked for steganography, and have tried to adapt their tests around different steganographic hypotheses?
* whether they have tried to model common letter pairs as composite tokens?

I wonder how Knight and Reddy would respond if they were asked any of the above? Maybe we’ll get to find out… 😉

Or you could just ask them if aliens wrote it, I’m sure they’ve got a good answer prepared for that by now. 🙂

The Daily Grail has today’s hot cipher history story: that Dan Brown’s soon-to-be-released novel “Inferno” is somehow based around the Voynich Manuscript. Apparently, the proof of this particular pudding is, well, a cipher, one apparently hidden in plain sight on Brown’s website:-

[Image: dan-brown-voynich-code]

In Rolf Harris’ immortal phrase, “Can you tell what it is yet?” I hope you can, because all it is is… a 4×4 transposition cipher of “MS 408 YALE LIBRARY”. Yes, that’s it. Which is in itself a fairly underwhelming starting point, considering that the Voynich Manuscript isn’t MS 408 in “Yale Library”, but in Yale University’s Beinecke Rare Book & Manuscript Library. But (of course) that wouldn’t fit in 16 letters. 🙂
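To show just how little is going on here, this is a minimal Python sketch of a 4×4 transposition. I’m assuming the simplest possible arrangement (write the 16 characters into the grid row by row, read them out column by column); the exact layout used in Brown’s image isn’t recorded here, so treat the reading order as illustrative.

```python
def transpose_grid(text, size=4):
    """Write text into a size x size grid by rows, read it out by columns."""
    if len(text) != size * size:
        raise ValueError("text must fill the grid exactly")
    rows = [text[i * size:(i + 1) * size] for i in range(size)]
    return "".join(rows[r][c] for c in range(size) for r in range(size))


if __name__ == "__main__":
    plaintext = "MS408YALELIBRARY"  # spaces removed: exactly 16 characters
    scrambled = transpose_grid(plaintext)
    print(scrambled)
    # For a square grid, applying the same transposition again undoes it.
    print(transpose_grid(scrambled))
```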

So, the story of the story is that Dan Brown will once again be wheeling out his “symbologist” Robert Langdon in a Renaissance-art-history-conspiracy-somehow-impinges-on-the-present-day-with-terrible-consequences schtick, but this time in Florence with Dante’s “Inferno” right at the heart of it (hence the title), with only the poor, much-abused Voynich Manuscript for company.

One part I’m not looking forward to is what Brown will have Robert Langdon make of the Voynich: for of all the mysteries I’ve ever seen, the Voynich is surely the least obviously symbol-laden. There’s no “sacred geometry” there, no gematria, no heresy, in fact no religion at all: just about all you could do is tie in the Voynich ‘nymphs’ with the same kind of alt.history “goddess” thing that Brown tried to stripmine in The Da Vinci Code… but all the same, that looks fairly hollow to me. I guess we’ll have to see what angle he does take… at least we won’t have long to wait (14th May 2013).

For me, the central contrast between Dante’s Inferno and the Voynich Manuscript is that they are diametrically opposite in referentiality: while the Inferno (and in fact the whole Divine Comedy) reaches out to touch and even include all of human culture, the Voynich Manuscript’s author seems to have worked with the same kind of monastic intensity to ensure it appears to refer to nothing at all. So, when Dan Brown collides the everything-book with the nothing-book, what kind of po-faced bathos-fest are we in for?

As an aside, I don’t see any numerology in the (original) Inferno: and considering the amount of effort Dante put into satirizing astrologers, alchemists, politicians, liars, frauds and the like in their aptly tortured circles of hell, I’m reasonably sure he’d mete out the same kind of punishment to numerologists. And probably to symbologists, too. And (if we’re lucky) to bad novelists… though you’ll have to put your own candidates forward for that, I’m far too polite. 😉

However, the bit I dread most is when people start to realize that Dante Alighieri’s Inferno was only the first part (of three) of his Divine Comedy: and with the current Hollywood craze for trilogies (The Hobbit trilogy, really?), what are the odds Dan Brown will extend any success with this book out into his own money$pinning Dante-based series, hmmm? The “Ka’chingferno” three-parter, no less!

Update: Erni Lillie upbraids me (and rightly so) in a comment here for omitting to mention his substantial 2004 (though the Wayback Machine only has a copy from 2007) Voynich Inferno essay, where he proposed that the nine “rosettes” on the Voynich Manuscript’s nine-rosette page might well represent the nine layers of Dante’s Inferno. My own experience of working on that particular page would place it closer to Purgatory, but perhaps we’re closer than medieval theologians would have it. 🙂

Truth be told, I did remember that I had forgotten something to do with Dante and the Voynich, but couldn’t for the life of me remember what it was I’d forgotten. And now that I’ve found it again, I was delighted to read it all over again, Renaissance warts and all. So, hoping that it’s OK with Erni to bring his work to a new generation of interested readers, here’s a link to a copy of his paper The Voynich Manuscript as an Illustrated Commentary of Dante’s Divine Comedy. Maybe it will turn out to be what Dan Brown’s new book plagiarizes (sorry, was amply inspired by) this time round, who knows? 😉

Personally, I suspect the smart money is indeed on Brown’s having the Voynich’s nine-rosette map turn out to be a map: with the devastating twist *yawn* that it actually represents a physical map of Dante-related locations in Florence, which Robert Langdon is then able to decode at speed thanks to his encyclopaedic knowledge of all things symbolic and Florentine, which ultimately leads him to the dark secret at the heart of a centuries-old conspiracy which he and his unexpected accomplice must choose whether to reveal to the horror of the world.

You know, basically the same as all his other books. 😉

Anyway, looking forward to the launch party at the Duomo, darling. Of course I’d like more olives, thanks for asking, and isn’t the Sangiovese simply, errrm, Divine? 🙂

A paper came out a few days ago on arXiv.org, called “Probing the statistical properties of unknown texts: application to the Voynich Manuscript”, written by three Brazilian academics (with an assist from two German academics).

The authors grouped Voynichese (i.e. Voynich text) hypotheses into three broad categories:

“(i) A sequence of words without a meaningful message;
(ii) a meaningful text written originally in an existing language which was coded (and possibly encrypted) in the Voynich alphabet; and
(iii) a meaningful text written in an unknown (possibly constructed) language.”

After developing a whole load of word-occurrence-based statistical machinery (defining “intermittency”, etc) and applying it both to real text corpora and to Voynichese, they conclude that the word structure of Voynichese is incompatible with shuffled texts (which is how they model (i)-class hypotheses), and “mostly compatible with natural languages” (the (ii)- and (iii)-class hypotheses). They end up by using their statistical machinery to suggest Voynichese “keywords” – words that, according to their statistical measures, stand out from the text.
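To give a flavour of this kind of test, here is a minimal Python sketch that compares how “bursty” a word’s occurrences are in a text against the same words in shuffled order. The measure used (standard deviation of the gaps between successive occurrences, divided by their mean) is a common burstiness statistic that I’m using as a stand-in: I’m not claiming it is the paper’s exact definition of intermittency, and all the function names are my own.

```python
import random
from statistics import mean, pstdev


def recurrence_gaps(words, target):
    """Distances (in words) between successive occurrences of target."""
    positions = [i for i, w in enumerate(words) if w == target]
    return [b - a for a, b in zip(positions, positions[1:])]


def burstiness(words, target):
    """Std/mean of recurrence gaps; None if the word is too rare to measure."""
    gaps = recurrence_gaps(words, target)
    if len(gaps) < 2:
        return None
    return pstdev(gaps) / mean(gaps)


def shuffled_baseline(words, target, trials=200, seed=0):
    """Average burstiness of target over randomly shuffled copies of the text."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        shuffled = list(words)
        rng.shuffle(shuffled)
        score = burstiness(shuffled, target)
        if score is not None:
            scores.append(score)
    return mean(scores) if scores else None


if __name__ == "__main__":
    # Toy text: "key" appears in two tight clusters separated by filler.
    text = ["key"] * 4 + ["filler"] * 100 + ["key"] * 4
    print("observed:", burstiness(text, "key"))
    print("shuffled:", shuffled_baseline(text, "key"))
```

Words whose observed score sits well above the shuffled baseline are the ones clustering in particular stretches of text, which is roughly the behaviour that topic-bearing “keywords” are expected to show.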

Their suggested English keywords (generated from the New Testament) are:-
* begat Pilates talents loaves Herod tares vineyard shall boat demons ve pay sabbath hear whosoever

Their suggested Voynichese keywords (generated from an EVA transcription, though they don’t say which, so possibly Takahashi’s?):-
* cthy qokeedy shedy qokain chor lkaiin qol lchedy sho qokaiin olkeedy qokal qotain dchor otedy

OK, but… what do I think? First off, I’m pleased to see that their results seem incompatible with “shuffled texts” or randomized texts, because that is what nearly all of the various Voynich “hoax” hypotheses rely on. Intuitively, just about anyone who has worked with Voynichese for any period of time is struck by its intense internal structuring on many levels: so it is nice to see the same result coming out from a different angle.

Secondly, what they mean by “mostly compatible” is that while Voynichese passes many of their proposed tests comfortably, it actually fails some of them (and only passes others by the slimmest of whiskers). To me, that implies either (a) an exotically- (and non-obviously-)structured language or constructed language, or (b) an obfuscated language (e.g. a ciphertext or shorthand): conversely, it seems to imply that Voynichese isn’t a one-to-one-map of any mainstream language (which is what cryptographers such as Elizebeth Friedman have been saying for years). Yet the earliest constructed language we currently know of was devised at least a century after the Voynich’s vellum dating (and about a century after its earliest marginalia), so we can almost certainly rule that possibility out.

I don’t know: while it’s always good to see people approaching the Voynich Manuscript from a new angle, I can’t help but feel that in just about every instance the Voynich’s author remains at least three or four steps ahead of them. The key paradox of Voynichese revolves around the fact that even though it so resembles a natural language, the way its words work as semantic units fails to do so in quite the same way. So for me, the important thing here is to try to understand the tests that failed, and see what they tell us about how Voynich words don’t work… but that will doubtless take a little time.

As for the suggested keywords: personally, I’d be rather more convinced by their statistical machinery if it had automagically suggested the word “Jesus” rather than “boat” or “vineyard” for the New Testament, so I have to say I’m far from persuaded that their list of Voynich cribs will help us unlock its secrets at all… but you never know, so perhaps let’s give them the benefit of the doubt on this one! 😉

Just the merest hint of a nudge to your collective set of virtual elbows, to remind you that the first Voynich London pub meet for basically ages is this evening (7th March 2013), at The Prospect of Whitby in Wapping. Though having said that, all cipher mysteries are fair game, not just the Voynich Manuscript: hence cipher pigeon fanciers and armchair treasure hunters are more than welcome to come along too. Plenty of room for everyone!

I’ll be there from 6.15pm or so, hoping to catch up on the latest Euro cipher gossip from Gotha and elsewhere, courtesy of Herr Cipher Skeptic himself, Klaus Schmeh, who’s on a flying visit to London having had a swift peek at the various enciphered books in the British Library (“The Subtlety of Witches”, etc). So if you can make your way to Wapping Wall for even half an hour, it would be really great to see you.

[Even stronger nudge: Tony Gaffney, what on earth do I have to do to persuade you to come along? I haven’t seen you in 25 years or so!]

Just so you know: if it’s a nice evening (or if someone happens to bring their dog along with them, John 🙂 ), the chances are we’ll be located in the terraced area through the pub to the back left (looking out over the Thames). Otherwise, we could be anywhere on the pub’s two floors, depending on how busy it happens to be. Looking forward to it!

Moshe Rubin just emailed me to let me know that his extensive October 2011 Cryptologia article “John F. Byrne’s Chaocipher Revealed: An Historical and Technical Appraisal” (vol. 35 issue 4, pp.328-379 [!!!]) can currently be viewed and downloaded for free from Taylor & Francis (who publish Cryptologia), via the “Download full text” button there.

If (like me) you’re into both the social and technical aspects of historical cryptography, it’s a cracking old read, covering both Byrne’s life and his numerous attempts to get the US military to accept his “Chaocipher” invention. Yet Moshe’s article is far from all ra-ra-pro-Byrne stuff: it also makes clear…
* the system’s inherent fragility (because each step changed the state of the two rotors, it suffered from near-worst-case error propagation);
* Byrne’s cryptographic inexperience (the way that he proposed concealing the indicator settings was far from secure); and
* Byrne’s cryptologic naivety (he believed that the flat letter distribution of the ciphertext made it explicitly unbreakable).

If you’ve read Ratcliff’s “Delusions of Intelligence” (a book the GCHQ Historian recommended I read, thanks for that!), you’ll know that this last mindset was precisely what the various German agencies using the Enigma machine suffered from: and if Chaocipher had been extensively used by the Allies in WW2, who’s to say that Hitler’s fragmented array of codebreaking agencies wouldn’t have eventually found a way of breaking into it, just as they did with virtually all the Allies’ low-to-medium-echelon ciphers?

One thing that strikes me most about the whole saga is that even though Byrne (who sometimes wrote under the anagrammatic pseudonym “J. F. Renby”, I was amused to see) seems to have envisaged Chaocipher as an expensive-to-build set of mechanical rotors, I think it is actually very easy to use with two Scrabble alphabets arranged in horizontal rows. (OK, Scrabble wasn’t devised until the 1930s, but my basic point still stands regardless). All the sliding operations (zenith / nadir, etc) then become immediately straightforward, arguably far more so than if you were using a machine to do the same.
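To back up the claim that the sliding operations are straightforward once you have two rows of tiles, here is a minimal Python sketch with each row held as a 26-letter string. It follows the Chaocipher steps as I understand them from the published description (zenith at position 0, nadir at position 13, counting from zero); the two starting alphabets are just illustrative, and any two permutations of A-Z would do.

```python
# Illustrative starting rows (an assumption: any two A-Z permutations work).
LEFT = "HXUCZVAMDSLKPEFJRIGTWOBNYQ"   # ciphertext row
RIGHT = "PTLNBQDEOYSFAVZKGJRIHWXUMC"  # plaintext row
NADIR = 13  # zero-based position of the nadir; the zenith is position 0


def permute_left(left, idx):
    """Slide the ciphertext letter to the zenith, then move zenith+1 to the nadir."""
    row = left[idx:] + left[:idx]
    return row[0] + row[2:NADIR + 1] + row[1] + row[NADIR + 1:]


def permute_right(right, idx):
    """Slide the plaintext letter one past the zenith, then move zenith+2 to the nadir."""
    row = right[idx + 1:] + right[:idx + 1]
    return row[:2] + row[3:NADIR + 1] + row[2] + row[NADIR + 1:]


def chaocipher(text, left=LEFT, right=RIGHT, decipher=False):
    """Encipher (or decipher) uppercase A-Z text, permuting both rows each step."""
    out = []
    for ch in text:
        if decipher:
            idx = left.index(ch)
            out.append(right[idx])
        else:
            idx = right.index(ch)
            out.append(left[idx])
        left = permute_left(left, idx)
        right = permute_right(right, idx)
    return "".join(out)


if __name__ == "__main__":
    message = "WELLDONEISBETTERTHANWELLSAID"
    ciphertext = chaocipher(message)
    print(ciphertext)
    print(chaocipher(ciphertext, decipher=True))
```

Because both rows change after every single letter, one wrong letter while deciphering throws off everything that follows, which is the fragility the article points out.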

Regardless of whether or not Scrabble tiles are the best way to Chaocipherify your plaintext, I’d argue that what sets Byrne’s cryptographic ideas apart most is the way he conceptualized his crypto system in terms that mesh peculiarly well with modern computer science: in fact, it’s quite hard to describe it at all without lapsing into contemporary CompSciSpeak. It’s almost as if Byrne were projecting himself forward into a software world: but then again, one of the chapters of his autobiography was SciFi, so perhaps the future was where he felt most at home! 🙂

If you have been following the coverage here of the recent WW2 cipher pigeon story with more than the bleariest of eyes, you’ll know that I’ve repeatedly speculated whether its “W Stot Sjt” signature might well have actually been written by Serjeant William Stout of the Royal Engineers. Though (as we’ve already seen) he died not long after D-Day, I wondered whether it might be possible to find out more about his story by tracking down surviving members of his family and asking them.

Just before Christmas, I finally managed to get in contact with Stout’s daughter, and asked if she could see if she had a copy of his signature or his handwriting. Delightfully, I received from her this last week a small package containing some wartime photographs of her father, a photograph of his grave taken in 1948, and – most surprisingly of all – a 1940 field service post card (“Army Form A. 2042 / R.A.F. Form No. 1929”). Such postcards contained a list of barely informative sentences (“I am quite well”, etc), out of which the sender crossed out all those lines that did not apply: there’s an example online here.

Aha, I thought: will the signature pencilled on it turn out to match the signature on the pigeon cipher form? After some lightweight image processing, I placed the two side by side so as to compare them as reliably as I could…

[Image: w-stout-signature-comparison-small]

You’ve worked out the answer already, I think: which is that the two names were clearly not written by the same person. Which is a shame: but despite not being a proof, it’s still very far from a disproof. In the busy fog of war, a message could easily have been written by one person (the sender), enciphered and/or copied by a second (the signaller), and then sent by pigeon by a third (the pigeon handler).

In fact, various historians have already commented to me that they thought it quite unlikely that a Serjeant in the R.E. would have had the responsibility (or even the practical means) for enciphering a message in the field. So the fact that our enciphered pigeon message was not written by Serjeant Stout might arguably make more sense than if it had been… but it’s hard to be sure either way.

All the same, it has to be said that the best cipher mysteries tend to yield their secrets slowly (at best): so perhaps we shall have to resign ourselves to waiting a little longer yet for a pigeony breakthrough… we shall see!