(Please excuse the impersonality of what follows, but so many linguistic Voynich theories are popping up at the moment that responding to them all individually would be an even greater waste of my life than trawling through their sad attempts at ‘research’: sorry, hope you understand, etc.)

Dear linguistic Voynich theorist,

Thank you so much for your fascinating [1] and generally well-researched [2] paper. Unfortunately, it seems that in your enthusiasm to publish [3], you may well have skipped past some important details that would have presented your evidence, reasoning, and conclusions in a somewhat different light.

For example, your literature review somehow omitted to mention any of the fifty-plus [4] linguistic Voynich theories that had been published previously: only the most eagle-eyed of barristers would be able to highlight how these differed from yours to any significant degree.

I was interested to note [5] that you repeated the late Stephen Bax’s opinion (perhaps without even knowing that he was the source) that it is OK for linguistic Voynich theorists to disregard all previous statistical and analytical work carried out on the Voynich Manuscript’s text. However, given that almost all of that evidence and observation runs directly counter to your linguistic Voynich theory (and indeed Bax’s as well), it is hard not to draw the conclusion that you have been more than a little [6] selective. By stepping past all the practical difficulties with reconciling Voynichese with natural languages that have been pointed out from the 1950s onwards by the Friedmans and many others, it seems as though you have taken a particularly blinkered view of the challenge involved.

As to what you think comprises evidence that supports your particular linguistic reading, I’m sorry to have to point out that neither optimistically plucking words from all manner of dictionaries nor running your fragmentary and non-grammatical [7] output through Google Translate for validation constitutes ‘evidence’ in any normal sense of the word. Instead, these merely show that you are willing to throw darts at a map blindfolded and then claim to have invented the satnav. [8]

Your attempted argument as to how Voynichese’s word-forms structurally map onto the plaintext forms you highlight would have been more persuasive had you looked for evidence beyond the two or three pages from the Voynich Manuscript you restricted your attention to. In reality, had you done so you would have realized that the ‘language’ apparently employed in the Voynich Manuscript varies significantly between sections, between bifolios, and also between different page and line positions (line-initial, word-initial, word-final, line-final, labels, etc): and it turns out that the tiny subset into which you put your time is not at all representative of the rest. So your supposed ‘translation’ fails to scale up in any way at all.

Finally: given that in your paper you were unable to sustain your ‘translation’ of the (supposed) plaintext language(s) of the Voynich Manuscript beyond a handful of somewhat optimistic [9] readings, and that this is almost exactly the same level of (un)convincingness that other near-identical linguistic Voynich theories manage, it is hard [10] to feel persuaded by your claims that you have “finally peeled back the veils of secrecy on this most mysterious of manuscripts”. Instead, it seems overwhelmingly likely that you have fallen headlong into the same shallow logical traps as pretty much every linguistic Voynich theorist ever.

At this point, it would be a wonderful thing to be able to say that despite some methodological flaws and over-enthusiastic leaps to conclusion, your paper has still managed to advance our knowledge of the Voynich Manuscript. But this would not be entirely true. [11] Instead, all you have actually achieved is wasting your own time along with that of everyone else unfortunate enough to read your miserable offering: ultimately, your paper is a bland and tepid mix of pseudohistory, pseudoscience and pseudolinguistics that moves us all backwards rather than forwards in any perceivable way.

Sorry, hope you don’t mind too much, best wishes, etc, Nick

Notes:

[1] This is a lie.
[2] This is a bigger lie.
[3] i.e. “slapdash haste”.
[4] Perhaps even a hundred.
[5] This is an even bigger lie.
[6] OK, “obscenely”
[7] OK, “pathetically nonsensical”
[8] It’s a good job I toned this sentence down, the first draft was a bit too strong.
[9] OK, “laughable and utterly random”
[10] OK, “so close to impossible as to make no practical difference”
[11] In fact, this would be a lie big enough to blot out the sun.

Thanks to Newsweek, Fox News, The Daily Mail and The Independent [*sigh*], some techy Canadian Voynich research is currently enjoying its day in the media sun. (Hint to authors: sorry, but based on recent evidence, it would seem that you have ~48 hours to get your next funding request submitted and approved before everyone currently cheering starts booing.)

CompSci professor Greg Kondrak and graduate student Bradley Hauer presented their research at the 2017 ACL conference, and their paper “Decoding Anagrammed Texts Written in an Unknown Language and Script” appeared in Transactions of the Association for Computational Linguistics Volume 4, Issue 1, pages 75–86 [though the PDF is freely downloadable, at least for now].

From the press coverage so far, you might think that they had CARMELed the Voynich (i.e. thrown a tame supercomputer and some kind of clever-arse AI libraries at the problem): for, as the media incessantly repeat at the moment, All Human Problems Will Inevitably Yield To The Scythed Mega-Bulldozer That Is AI. But… is any of that true? Or useful? What’s actually going on here?

Behind the Kondrak and Hauer headlines

The initial question is obvious: what did Kondrak and Hauer actually do to try to crack the Voynich’s mysterious secrets that (they thought) nobody else had tried before? A quick snoop reveals that Bradley Hauer is a pretty smart crypto cookie: the simple substitution cipher solver presented in his 2014 paper “Solving Substitution Ciphers with Combined Language Models” outperforms many competing academic solutions. It does this by using both letter statistics and word lists at the same time (a) to solve Aristocrat cryptograms (i.e. ones where you know where the word boundaries are) even under mildly noisy conditions, and (b) to solve Patristocrat cryptograms (i.e. ciphertexts without spaces, though the recursive approach used to turn Patristocrats into candidate Aristocrats seems somewhat heavy-handed), before finally moving on to trying (unsuccessfully) to reproduce the kind of deniable encryption loosely proposed in Stanislaw Lem’s (1973) “Memoirs found in a bathtub”.

And here’s what Hauer looks like in real life:

So what happened before the Voynich paper was even written was that Hauer had built up a lot of software machinery for solving nicely-word-boundaried simple substitution ciphers at speed, and where some kind of mild text mangling had optionally taken place. And so it should not be a surprise that he carried this technology and approach forward, insofar as the 2016 paper tries to solve Voynichese as if it were a nicely-word-boundaried simple substitution cipher that had had its text mangled via anagramming plus optional abjad-style vowel removal. Given that as the paper’s founding presumption, all it is trying to do is evaluate which plaintext language was used if that entire presumption just happened to be correct (oh, and the transcription used was accurate).
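To make that founding presumption concrete, here is a minimal sketch of the forward transformation the paper assumes was applied to the plaintext: optional abjad-style vowel removal, anagramming within each word, then a simple substitution. This is not Kondrak and Hauer's code; the function names, the vowel set and the random key are all assumptions made purely for illustration.

```python
import random

VOWELS = set("aeiou")  # assumed vowel set, for illustration only


def drop_vowels(word):
    """Abjad-style reduction: keep consonants (fall back to the word if none remain)."""
    reduced = "".join(ch for ch in word if ch not in VOWELS)
    return reduced or word


def anagram(word, rng):
    """Shuffle the letters of one word; word boundaries are left untouched."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


def substitute(word, key):
    """Apply a simple (monoalphabetic) substitution key letter by letter."""
    return "".join(key[ch] for ch in word)


def encipher(plaintext, key, rng, abjad=False):
    """Mangle then encipher a space-separated, letters-only plaintext."""
    out = []
    for word in plaintext.lower().split():
        if abjad:
            word = drop_vowels(word)
        out.append(substitute(anagram(word, rng), key))
    return " ".join(out)


alphabet = "abcdefghijklmnopqrstuvwxyz"
rng = random.Random(0)
shuffled = list(alphabet)
rng.shuffle(shuffled)
key = dict(zip(alphabet, shuffled))  # hypothetical substitution key

cipher = encipher("one thing led to another", key, rng, abjad=True)
print(cipher)
```

A solver built on this model has to undo all three steps at once, which is why the choice of candidate plaintext language ends up being the only thing left to evaluate.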

Incidentally, the Voynich corpus used was 43 pages (“17,597 words and 95,465 characters”) of Currier-B text in the Currier transcription that one or both of Knight & Reddy had supplied, but the authors did not seem to have questioned the reliability or parsing choices behind that particular transcription. (More on this below.)

Voynich anagramming

Unlike Stephen Bax’s well-known Voynich 2014 paper (which began by gleefully flipping the bird at nearly all previous Voynich research), Kondrak and Hauer’s Voynich paper begins by covering what they consider related Voynich work (section 2.1) in a level-headed, if somewhat brief, way. The most relevant source they have for the notion that we might be looking at anagrammed text is Gordon Rugg’s 2004 paper: this floated the idea that there might be a similarity between alphabetically ordered anagrams (‘alphagrams’) and what we see in the Voynich Manuscript’s text.

Yet much has already been written about Voynich anagramming beyond this, not least William Romaine Newbold’s monstrously tangled ‘decryption’ (*shudder*). More recently, Edith Sherwood claimed both that it was a young Leonardo da Vinci who wrote the Voynich Manuscript, and that the Voynich text was written in anagrammed Italian (though so far she has mainly only tried to reconstruct Voynich plant names using her proposed scheme). As I pointed out in 2009 this seems extraordinarily unlikely to work in the way she proposes.

Arguably the most interesting previous Voynich research into anagrams (again, not mentioned by Hauer) has been that of London-based researcher and translator Philip Neal. In a (now long-lost) page he posted many years ago on the late Glen Claston’s voynichcentral.com website, Philip proposed:

Here is a transformation of plaintext into ciphertext which explains certain features of the Voynich “language”.

1. Divide a plaintext into lines
2. Sort the words of each line into alphabetical order
3. Sort the letters of each word into alphabetical order

1. one thing led to another thing last night
2. another last led night one to thing thing
3. aehnort alst del ghint eno ot ghint ghint

The result has some of the statistical properties of the Voynich text.

A. The frequency distribution of words and letters is the same as in the natural language plaintext, but the distribution of two-letter groups and two-word groups is significantly altered.
B. Words at the beginning of a ciphertext line tend to start with letters at the beginning of the alphabet. Compare the high frequency of Voynich “d” at the beginning of a line.
C. If a letter near the end of the alphabet has a tendency to be word-initial in the plaintext (e.g. German “w”), it will have a strong tendency to be the last word in a line. Compare the high frequency of Voynich “m” at the end of a line.
D. The ciphertext versions of frequent words will tend to cluster together in a line. That is, where a word such as “thing” occurs twice in the plaintext line (as in the above example) the two word sequence “ghint ghint” will occur, but “ghint” may also occur elsewhere in the line as an anagram of “night”.
E. A one-letter word of ciphertext can only be an anagram of a single word of plaintext (“a” can only be an anagram of “a”) and a two-letter word of ciphertext can only be an anagram of two possible words of plaintext (“et” can only be an anagram of “et” and “te”). This means that you cannot have a ciphertext line of the pattern “… i … i … ” or of the pattern “… et … et … et …”. This principle largely holds good in the Voynich text: there are only six exceptions in the corpus of Currier’s language B.
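Philip's three steps, and the clustering constraint behind points D and E, can be sketched in a few lines of Python. The function names are mine rather than his, and the check in `violations` is one assumed reading of point E: a short ciphertext word can appear in at most as many separated runs as it has distinct letter orderings. Note that a strict alphabetical sort places “thing” ahead of “to”, so the last three words come out in a slightly different order from the hand-worked example.

```python
from itertools import groupby, permutations


def neal_transform(plaintext_line):
    """Sort the words of a line, then sort the letters of each word."""
    words = sorted(plaintext_line.lower().split())
    return ["".join(sorted(w)) for w in words]


def separated_runs(tokens, target):
    """Count how many separate (non-adjacent) runs of `target` occur."""
    return sum(1 for tok, _ in groupby(tokens) if tok == target)


def max_runs(token):
    """Upper bound on runs: the number of distinct orderings of the letters."""
    return len(set(permutations(token)))


def violations(tokens):
    """One- and two-letter words that occur in more runs than point E allows."""
    return sorted({t for t in tokens
                   if len(t) <= 2 and separated_runs(tokens, t) > max_runs(t)})


cipher = neal_transform("one thing led to another thing last night")
print(" ".join(cipher))                 # aehnort alst del ghint eno ghint ghint ot
print(separated_runs(cipher, "ghint"))  # 2 (one run from "night", one from "thing thing")
print(violations("et x et y et".split()))  # ['et']
```

Running `violations` over each line of a transcription would be one way to count exceptions of the kind Philip mentions, though the result would depend heavily on the transcription and tokenization used.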

To his credit, Philip then immediately pointed out some problems with this suggestion:

1. Voynichese words do not conform to a strict alphabetical ordering of letters (there are quite a lot of words of the pattern dshedy).
2. Voynichese words have a strong tendency to contain only one instance of a given letter, unlike any obvious candidate language for the plaintext.
3. The enciphering described is not unambiguously reversible (however I think it would work as a private aide-memoire, or as a means of establishing priority like Galileo’s well known anagram announcing his discovery of the phases of Venus)

(Philip has since instead proposed a possible grid-like constraint on the position of Voynichese letters within Voynichese ‘words’, though problems with that alternative explanation remain.)

Incidentally, Philip has also pointed to a number of places within the Voynich Manuscript where entire lines appear to have been written in a non-one-after-the-other way (i.e. unexpected line transpositions), and nobody has yet come up with a powerfully convincing explanation for the presence of “Neal keys” (sections of text typically delimited by pairs of single-leg gallows) in the top lines of pages (typically embedded ⅔ of the way across). He is a sharp observer, and these anomalies are all inconsistent with the widely-held presumption that the text we are looking at here is completely unmangled.

Ultimately, though, it remains a sizeable step (or three, or indeed more) to go from anywhere here to Hauer’s presumption that what we are looking at is straightforwardly anagrammed text in a conventional European language, whether abjad or not.

The actual Voynich research gap

If asked for the single largest methodological problem with Voynich research, I would point to the way that Voynich researchers tend to make a series of unfounded assumptions:
(a) the transcription they are using is perfectly reliable;
(b) the way that they parse that transcription (i.e. into tokens) is correct – there are many hidden linkages here which are each probably sufficient to derail any decryption attempt;
(c) the candidate plaintext languages they consider are genuinely representative of the Voynich Manuscript’s plaintext;
(d) no other textual transformations are present;
(e) the putative hypothetical transformation that they just happen to have plucked from the air and which they are testing is precisely that which is present in the Voynich Manuscript; and
(f) the output of their reverse transformation will be straightforward text that can be read and marvelled over by historians.

In the case of Kondrak and Hauer, I hope it should be clear that they have fallen foul of every one of these issues in turn: and their paper is all the worse for it. It is one thing to note in passing that Esperanto’s “extreme morphological regularity […] yields an unusual bigram character language model which fits the repetitive nature of the VMS words” (p.83), but it would be quite another to point out that this might easily have arisen from the way that Voynichese needs to be parsed in order for it to make sense: and it is this apparent lack of perception of the practical difficulties that all Voynich decryptors face that devalues the genuinely good work that went into their paper.

What particularly frustrates me is that in spite of these many issues, there are plenty of ways Voynich researchers can make genuine progress towards understanding what is going on: yet they instead persist in trying to airball their own personal Voynich match-winner from the other end of the basketball court. They seem seduced by the glamour of being The One Who Solved The Voynich, instead of getting on with the graft of making a difference to what we know. 🙁

Yet computational linguistics has such a rich toolbox (of which CARMEL is merely one small screwdriver) that it surely has ample capacity to at least try to bridge all the actual research gaps that people are falling into, e.g.:

* What is the right way to parse EVA into tokens? (e.g. is EVA ‘or’ two tokens or one? is EVA ‘cth’ three tokens, two tokens, or one? etc)
* How does Currier A map to Currier B? And what about all the subtypes of each of these?
* What are the differences between them and “Currier C”? (Rene Zandbergen’s term for labelese)
* Can we determine whether line-initial letters are likely reliable or unreliable?
* Are words abbreviated (e.g. is EVA y some kind of truncation symbol)? If so, are A and B abbreviated in exactly the same way?
* etc
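To show why the first of these questions matters, here is a minimal sketch of a greedy longest-match tokenizer for EVA words. The two parsing schemes are hypothetical examples, not established groupings; the point is only that the same transcribed word yields different token streams (and hence different statistics) depending on which scheme you pick.

```python
def tokenize(word, multigraphs):
    """Greedy longest-match split of an EVA word into tokens."""
    ordered = sorted((m for m in multigraphs if m), key=len, reverse=True)
    tokens, i = [], 0
    while i < len(word):
        for m in ordered:
            if word.startswith(m, i):
                tokens.append(m)
                i += len(m)
                break
        else:
            # No multigraph matches here: emit a single EVA letter.
            tokens.append(word[i])
            i += 1
    return tokens


SCHEME_A = []                          # every EVA letter is its own token
SCHEME_B = ["cth", "ch", "sh", "or"]   # hypothetical grouping, for illustration

print(tokenize("cthor", SCHEME_A))  # ['c', 't', 'h', 'o', 'r']
print(tokenize("cthor", SCHEME_B))  # ['cth', 'or']
```

Any downstream test (letter frequencies, bigram models, candidate language ranking) could then be run under several schemes to see how sensitive its conclusions are to the parsing choice.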

If people had the intellectual good sense to stop trying to fly over all these separate hurdles at the same time in a Steve Austin-style 100m leap of misplaced faith, we might start to make real progress. However, even when researchers do have the necessary brains to make progress (as Hauer clearly has), it seems they have insufficient strength of mind to not be tempted by the glamour of the big ticket “Researchers Crack Voynich Manuscript” headline. 🙁

Because last year’s Voynich research brought me a step closer to German calendars, I finally got round to reading Ernst Zinner’s epic “Regiomontanus: His Life and Works” (or, rather, to Ezra Brown’s 1990 translation of the same).

On the one hand, Zinner is crushingly magisterial, in both tone and deep attention to detail (though the appendices by other scholars do sometimes highlight Zinner’s occasional dependence on others’ unreliable translations). Yet on the other hand, it is a style of writing characterized by what is clearly a passionate drive to understand Regiomontanus within his cultural, scientific, and mathematical context.

Say what you like about Zinner, but you could never accuse him of skimming the surface of the subject: as the old joke goes, he’s definitely more bacon than eggs (i.e. where “the chicken is involved but the pig is committed”).

Peuerbach and nocturnals

I’ve written a number of times before about how I strongly suspect that the circular drawing on f57v had a ring of letter groups that was originally made up of 4 x 18 symbols, but where the first two letter shapes of each set of eighteen were subsequently joined together into odd gallows-like characters to turn the sequence into a (far more mysterious) 4 x 17 letter-group sequence. Quite why the author did that is not known, but it would seem to me to have been done to conceal the extrinsic 4 x 18 structure.

Why, you may ask, would a 4 x 18 ring be a giveaway? This, in my opinion, is because 360 degrees / (4 x 18) = 5 degrees, which is the kind of explicit marking you would see on an astrolabe-like instrument of some sort. And because one of the secret astrolabe-like instruments of the mid-fifteenth century was the “nocturnal”, “nocturlabe” or “stardial” (astrolabes themselves were hardly secret by that time), I have long wondered whether what was depicted here was this specific instrument.
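The arithmetic is easy to check, along with what happens to the spacing once each set of eighteen is reduced to seventeen. The function name below is just an illustrative helper.

```python
def degrees_per_group(sets, groups_per_set):
    """Angular spacing when a full circle is divided evenly among all groups."""
    return 360 / (sets * groups_per_set)


print(degrees_per_group(4, 18))            # 5.0
print(round(degrees_per_group(4, 17), 2))  # 5.29
```

So a 4 x 18 ring divides the circle into tidy 5-degree steps, whereas a 4 x 17 ring does not divide it into a whole number of degrees.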

And so I was fascinated to read the following brief note in Zinner (pp.26-26), in his discussion of Georg Peuerbach (Regiomontanus’ mentor and teacher):

From 1455 on, there existed “stardials” [nocturnals] to tell time at night by means of the Pole Star and two stars {the “pointers”} in Ursa Major.

The footnote reference Zinner gives for this is: “Ernst von Bassermann-Jordan. Uhren. Berlin, 1922. Figure on page 21.” And so I went off (eventually) to have a look for this specific edition.

It turns out that von Bassermann-Jordan was a well-known clock and watch collector, whose classic book (“Uhren”) on the subject was reprinted many times. I was therefore delighted to find out that I could order a relatively cheap print-on-demand copy of the exact same 1922 version that Zinner referenced, and speedily sent off my money.

I must admit to having been a little bit surprised when a padded envelope appeared covered in Indian stamps (and to having wondered whether it was a coincidence that “Uhren” was an anagram of “Nehru”), but that’s globalization for you.

Even though the print-on-demand book cover was really quite nice, the quality of the scans inside was unfortunately more than a little disappointing in places. But even so, I could now try to find what von Bassermann-Jordan said on page 21 that Zinner remarked upon. Sadly, this was one of the many places where the scans became somewhat unrecognizable by the right-hand edge.

Yet when I used Google to search for “Orientierung der Horizontalsonnenuhren” on the same page, this yielded three hits on archive.org, including the 1922 edition I had just bought. (There was also a 1914 edition and a 1920 edition). Unsurprisingly, the 1922 edition was (without any real doubt) the source of the somewhat mangled scans for the Indian POD company, so I can show you what I was trying to read, direct from the source:

Even so, this meant that I was able to find the same thing in the (much clearer, and significantly more readable) 1920 edition of “Uhren”, so that you can hopefully see the structure of the nocturnal (dated “1456”) much more clearly:

I presume that this is referring to the Bayerisches Nationalmuseum in Munich, which (encouragingly) has a collection of scientific instruments. However, I wasn’t able to find the one depicted in its object database (and I suspect Zinner wasn’t able to find it either), but perhaps one of my German readers will have more luck than me. 🙂

Other nocturnals

A very good source for the history of the nocturnal is Günther Oestmann’s (2001) “On The History of the Nocturnal”. Oestmann notes that the idea that the nocturnal was first invented in China has been comprehensively debunked, and that it is instead a European invention – there are no Arabic nocturnals from the Middle Ages. There was also a predecessor to the nocturnal (a sighting tube and disk, described by Pacificus in the 10th century), but the more compact hand-held nocturnal was clearly a far more usable version of the same thing.

The “V2.0” idea of nocturnals had actually been discussed as early as the 12th century (though few seem to have been actually built). Raymond Lull mentions the nocturnal in his Nova geometria (1299), and the device was made famous by Peter Apian’s 16th century printed book:

But documentation on nocturnals between Lull and the 1456 nocturnal noted by von Bassermann-Jordan and Zinner seems quite thin. Oestmann lists all the 15th century nocturnal manuscripts he is aware of, together with references (not included here) to where they are mentioned in Zinner’s (1925) “Verzeichnis der astronomischen Handschriften des deutschen Kulturgebietes”:

* Wolfenbüttel, HAB: Cod. Guelf. 81.26 Aug. 2°, fol. 144v (Use of the Nocturnal, Latin 1461)
* Göttingen, UB: 2° Philos. 42m, fol. 55r/v .00 (Johann v. Gmunden [?], Construction of the Nocturnal, in a collection of astronomical texts, 15th cent.)
* Würzburg, UB: M. ch. q. 132, fol. 153v-154v, 155v (Construction of the Nocturnal, Latin, in a collection of astronomical texts, late 15th cent.)
* Leipzig, UB: Cod. 1469, fol. 201r-207r (Construction of the Nocturnal, Latin, in a collection of astronomical texts, 14-15th cent.)
* Munich, Bayer. StB: Clm 24105, fol.65v-67r (Use of the Nocturnal, German, in a collection of astronomical texts, 15/16th cent.)
* Munich, Bayer. StB: Clm 214, fol. 167r-185v (Construction of the Nocturnal, Latin, in a collection of astronomical tables, 15th cent.)
* Bern, Burgerbibl.: Cod. 157 fol. 27v-28v (Construction and Use of the Nocturnal, Latin, 15th cent.)
* Zürich, Zentralbibl.: Ms. C 107, fol. 107r/v (Wilhelm Hofer (Carthusian monk and pupil of Georg Peuerbach), Construction and Use of the Nocturnal, Latin, Gaming (Lower Austria) 1472/79; see L. C. Mohlberg, Mittelalterliche Handschriften (= Katalog der Zentralbibliothek Zürich, vol. I), Zürich 1951, p.55f., 361)
* Ottobeuren, Klosterbibl.: Ms. II, 319, fol. 122-123 (Construction of the Nocturnal, 15th cent.)
* Meiningen, Landesbibl.: Pd 32.44, fol. 92r-98v, 104r-105v (Construction of the Nocturnal, 15th cent.)

Oestmann also mentions Chartres MS 214 (olim 173; destroyed in 1944), which looked like this, though note that it actually belongs to the earlier “sighting tube” tradition from Pacificus:

But because Oestmann relies on Zinner, who in turn was only looking at astronomical manuscripts within the German cultural orbit, there are doubtless many more to be found. For example, there’s a nice volvelle nocturnal in fol. 25r of MS Ashmole 370, an English manuscript dated ~1424 that I’d really like to see the rest of one day:

If anyone is aware of a better / more recent / more pan-European source on the 12th-15th century history of the nocturnal / nocturlabe than Oestmann, please let me know!

“Stretched out arms”

For Voynich researchers, I would argue that the single most extraordinarily interesting paragraph in Oestmann’s paper is as follows:

Closely connected with the history of the nocturnal are certain diagrams in nautical texts, which served as mnemonic devices for the correction of the measured altitude of the Pole Star. The position of α and β Ursae minoris respectively α and β Ursae maioris, often called ‘Guards’, indicated the correction to reduce the observed Pole Star altitude to obtain latitude. A man with stretched out arms standing in the Pole was used. If the Guards were found over his head the Pole Star stood 3°5′ under the Celestial Pole and vice versa. The two arms marked the side deviations and also intermediate positions were recorded in mnemonic verses.
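As a rough sketch of the rule this passage describes, the correction can be written as a simple addition or subtraction. Only the 3°5′ figure and the “over his head” case come from the quoted text; treating “vice versa” as the Guards being at the man's feet is my reading, the names are hypothetical, and the arm and intermediate positions are left out because the passage gives no values for them.

```python
def dms_to_degrees(degrees, minutes):
    """Convert degrees and arc-minutes to decimal degrees."""
    return degrees + minutes / 60


POLE_STAR_OFFSET = dms_to_degrees(3, 5)  # 3°5′, as given in the quoted passage


def latitude_from_pole_star(observed_altitude, guards_position):
    """Latitude is the altitude of the Celestial Pole, not of the Pole Star itself."""
    if guards_position == "head":
        # Pole Star stands below the Celestial Pole, so add the offset.
        return observed_altitude + POLE_STAR_OFFSET
    if guards_position == "feet":
        # Assumed reading of "vice versa": Pole Star stands above the pole.
        return observed_altitude - POLE_STAR_OFFSET
    raise ValueError("only 'head' and 'feet' are covered by this sketch")


print(round(latitude_from_pole_star(40.0, "head"), 2))  # 43.08
```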

Given that f57v specifically depicts people with stretched out arms (at the top and the bottom), I suspect that this is an avenue of research that is well worth pursuing further:

Oestmann’s footnote #17 gives as his source:

See for example the Regimento do Norte (probably composed in the 15th century; München, Bayer. StB: 4° Inc., 1551). On the nautical ‘Regiments’ see Joaquim Bensaude, L’astronomie nautique au Portugal à l’époque des grandes découvertes (Bern, 1912/17, repr. Amsterdam 1967), pp. 136-145, 223f.; Hermann Wagner ‘Die Entwicklung der wissenschaftlichen Nautik im Beginn des Zeitalters der Entdeckungen nach neuern Anschauungen’, Annalen der Hydrographie und Maritimen Meteorologie, 46 (1918), pp. 215-220.

I’ve noticed a number of things about the Paul Rubin ciphertext, which are definitely beyond what the FBI’s cryptanalysts managed to find. Please feel free to advance these yet further!

The Paul Rubin Ciphertext

Here’s my transcription, as posted on the Cipher Foundation website, but with the individual lines numbered:

[01] digIs sawthn'g mathUlley-Dulles crancklavn' meteore iElli
[02] zheaopfvamn greA'Lltenmn
[03] kKiqtu albawmnabs dzhjellEiE matel ungdreabozvmie oie
[04] sprekln meIktrene fodroscolmn oeir
[05] *driEk Conant astereantol Iyvondiolon
[06] desceth megleagna mAlzbourgnion grele

[07] newtdo sfoatzdexklagh 2pont ¼ly asgestaltverbensdi

[08] 7469921
[09] 100.011x100.10x.10011.1.xx0.101.x.001011.101x1011.1001..10x1

[10] 01.001011x10.1x.11101.x1.001x1.001001

[11] 0.101.x.101110.x101.1101101.0101x1.1011

[12] Want: datum Tywood Janossey Ketelle

[13] R-QR6
[14]                aliacaui PER

The Last Two Words

It seems almost certain that “PER” stands for “Paul Emanuel Rubin”, which gives us a reasonable amount of confidence that this is a real ciphertext Rubin himself had made.

Rubin liked reading science fiction, and repeatedly claimed to be a member of a “Brooklyn Astrophysics Society”: however, despite asking a lot of people, the FBI were unable to turn up any reference to any such society – Rubin seems to have dreamed this up completely. So there is (I think) good reason to suspect that we might find obscure references to science fiction and/or imagined science embedded in this ciphertext.

And in fact this is exactly what we find on line [14]. The word immediately before PER – “aliacaui” – comes from a 1950 novelette by Poul Anderson called “The Helping Hand”, that first appeared in Astounding Science Fiction:

Valka Vahino sat in his garden and let sunlight wash over his bare skin. It was not often, these days, that he got a chance to aliacaui… What was that old Terrestrial word? “Siesta”? But that was wrong. A resting Cundaloan didn’t sleep in the afternoon. He sat or lay outdoors, with the sun soaking into his bones or a warm rain like a benediction over him, and he let his thoughts run free. Solarians called that daydreaming, but it wasn’t, it was, well — they had no real word for it. Psychic recreation was a clumsy term, and the Solarians never understood.

In the story, “Aliacaui” is a central concept to the Cundaloans, one which the hard-working Solarians (machine-culture humans) find difficult to grasp:

“For instance, just this matter of the siesta. Right now, all through this time zone on the planet, hardly a wheel is turning, hardly a machine is tended, hardly a man is at his work. They’re all lying in the sun making poems or humming songs or just drowsing. There’s a whole civilization to be built, Vahino! There are plantations, mines, factories, cities abuilding — you just can’t do it on a four-hour working day.”

So this is the single word from the Science Fiction universe that Paul Rubin valued so much that he used it as a farewell in his note: “Aliacaui”, meaning lying around “making poems or humming songs or just drowsing”. Really, you couldn’t make it up.

The Codebook Indices

I’d also immediately point out that lines [01]-[07] almost certainly each use a different codebook entry, and that the seven codebook indices for them are on line [08] – 7469921. It seems likely (though not yet proven) that these are the indices in the order the lines appear.
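Under that in-order assumption (which is unproven), the pairing of text lines to digits can be laid out mechanically. The variable names here are purely illustrative.

```python
from collections import defaultdict

INDEX_LINE = "7469921"                              # line [08] of the transcription
TEXT_LINES = [f"{n:02d}" for n in range(1, 8)]      # lines [01] to [07]

# Assumption: the digits are listed in the same order as the lines.
pairing = dict(zip(TEXT_LINES, INDEX_LINE))

# Group lines by index to see whether any index would be used twice.
lines_by_index = defaultdict(list)
for line, index in pairing.items():
    lines_by_index[index].append(line)

repeats = {i: ls for i, ls in lines_by_index.items() if len(ls) > 1}
print(pairing)
print(repeats)  # {'9': ['04', '05']}
```

One thing this makes visible is that, if the digits really are in line order, lines [04] and [05] would share index 9, which is worth bearing in mind when comparing those two lines.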

Furthermore, lines [09]-[12] almost certainly each use a different codebook entry, and the four codebook indices for these are on line [13] – R-QR6.

Hence, the three “binary”-like lines would seem to be codebook entries “R-QR”, while all the other lines’ codebook indices are digits: 7469921 and 6. I pointed this out to Craig Bauer before his book came out, but it was (alas) too late to update his Paul Rubin chapter.

Still, knowing this should help us avoid many pointless cryptanalytical tests: for example, there would seem to be no point in carrying out any test that combines a binary-like line with a text-like line, because they would seem to be using completely different types of codebook.

The Three Binary-Like Lines

If the 0s and 1s in these lines are binary, I noticed something a little odd about them: specifically, that they contain no instances of ‘000’, and only two instances of ‘111’.
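This observation is easy to reproduce from the transcription above. The sketch below counts (possibly overlapping) occurrences of each triple in lines [09] to [11] exactly as transcribed, so ‘.’ and ‘x’ act as breaks between runs of digits.

```python
BINARY_LIKE = {
    "09": "100.011x100.10x.10011.1.xx0.101.x.001011.101x1011.1001..10x1",
    "10": "01.001011x10.1x.11101.x1.001x1.001001",
    "11": "0.101.x.101110.x101.1101101.0101x1.1011",
}


def count_overlapping(text, pattern):
    """Count occurrences of `pattern` in `text`, allowing overlaps."""
    return sum(1 for i in range(len(text) - len(pattern) + 1)
               if text.startswith(pattern, i))


totals = {p: sum(count_overlapping(line, p) for line in BINARY_LIKE.values())
          for p in ("000", "111")}
print(totals)  # {'000': 0, '111': 2}
```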

I therefore wondered whether the ‘.’ and ‘x’ characters might be standing in for ‘000’ or ‘0000’, and ‘111’ or ‘1111’. I converted various permutations for line [09] (the longest binary-like line) into the corresponding streams of pure binary digits, and then ran them through index of coincidence tests online.

[09] 100.011x100.10x.10011.1.xx0.101.x.001011.101x1011.1001..10x1

I don’t have a positive result yet (the IoC probably isn’t the most reliable test for this kind of thing, but I thought it was worth a try). Even so, if you happen to be looking at this part of the ciphertext, I think this currently seems like the most likely route to an answer.
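For anyone wanting to repeat this offline, here is a minimal sketch of the same experiment. The eight candidate expansions follow the idea described above; the chunk sizes used to turn the bit stream into symbols are my assumption (the online tests may have grouped the bits differently), and all the names are hypothetical.

```python
from collections import Counter
from itertools import product

LINE_09 = "100.011x100.10x.10011.1.xx0.101.x.001011.101x1011.1001..10x1"


def expand(line, dot, x):
    """Replace '.' and 'x' with the given runs of binary digits."""
    return line.replace(".", dot).replace("x", x)


def chunks(bits, size):
    """Split a bit string into consecutive fixed-size groups (remainder dropped)."""
    return [bits[i:i + size] for i in range(0, len(bits) - size + 1, size)]


def index_of_coincidence(symbols):
    """Probability that two symbols drawn without replacement are identical."""
    n = len(symbols)
    if n < 2:
        return 0.0
    counts = Counter(symbols)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


ZEROS, ONES = ("000", "0000"), ("111", "1111")
# '.' as zeros and 'x' as ones, plus the swapped assignment.
candidates = list(product(ZEROS, ONES)) + list(product(ONES, ZEROS))

for dot, x in candidates:
    bits = expand(LINE_09, dot, x)
    for size in (4, 5, 6):  # assumed symbol widths
        ioc = index_of_coincidence(chunks(bits, size))
        print(f".={dot:<4} x={x:<4} width={size} IoC={ioc:.3f}")
```

With a line this short the IoC values will be noisy, so they are better treated as a way of ranking the candidate expansions than as evidence for any one of them.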

The Letter Ciphers

According to Cipher Mysteries commenter Thomas, “Conant and B.H. Ketelle were members of the Manhattan Project. Janossy was a Hungarian nuclear physicist at the same time. Tywood is a professor of Nuclear Physics in Isaac Asimov’s short story ‘The Red Queen’s Race’ from 1949.”

Indeed, Asimov describes Elmer (Pop) Tywood in his short story as “Ph.D., Sc.D., Fellow of This and Honorary That, one-time youthful participant of the original Manhattan Project, and now full Professor of Nuclear Physics.”

Unfortunately, “Elmer Tywood was dead. He lay next to the table; his face congested, nearly black. No radiation effect. No external force of any sort. The doctor said apoplexy. […] In Elmer Tywood’s office safe were found two puzzling items: i.e. twenty foolscap sheets of apparent mathematics, and a bound folio in a foreign language which turned out to be Greek, the subject matter, on translation, turning out to be chemistry.”

Mysterious deaths and Manhattan Project physicists were therefore at the forefront of Paul Rubin’s mind: my suspicion is therefore that Rubin’s book of words that would drive his code project may well turn out to be a list of names of members of his imaginary Brooklyn Astrophysics Society. We’ll probably never see it, of course: but it is what it is.

Back in 2015, I blogged about the ciphertext that was found taped to Paul Rubin’s stomach, and also wondered whether Rubin might have suffered from paranoid schizophrenia, and whether the FBI would ever release his code-tables. On a larger scale, given that we only had a single scratchy newspaper photograph of his cipher to work with, it seemed that we were unlikely to make huge progress.

Well, a lot of that has now changed.

Craig Bauer’s “Unsolved!”

Craig Bauer’s (2017) “Unsolved!” covers a good number of cipher mysteries with a particular focus on Americana, and so his book covers the Paul Rubin case on pp.289-304. Very kindly, he passed me the following (much clearer) scan to work with, that his book had reproduced at a fairly small size:

Hence I’ve added a page on the Paul Rubin Cipher to the Cipher Foundation website: this also includes my transcription of the cipher, as well as (thanks to Albert Mock) a copy of Rubin’s death certificate and details of his grave.

Arguably even more importantly, however, Craig Bauer also received a 160-page document set from the FBI following a Freedom of Information Act request (though this arrived too late for him to use in his book), which is now also linked on the Cipher Foundation webpage – and this is where our real research begins.

Paul Rubin’s FBI File

A number of suggestions as to the possible contents of the cipher appear in the FBI file: that it might be written in cable language, that it might contain cribs for a chemistry examination, and so forth. It also lists (p.25) possibly the first nutty theory about the cipher, courtesy of Mrs L. Rohe Walker, 2 Beekman Place, New York 22, NY.

For me, though, one particularly interesting aspect of the file is that it details (on p.66) the specific sequence that the FBI’s cryptanalysts followed to try to understand Rubin’s ciphertext (though without success). It starts by listing the languages FBI linguists tried: “Yiddish, Hebrew, Russian, German, Hungarian, Finnish, Latinair [?], Lettuishuan [?], Turkish, Portuguese, Rumanian, Italian, Spanish, Dutch, Danish, Norwegian, Swedish, Malayan, Albanian.” They also “[t]ried to develop words phonetically and as abbreviations, with no success.”

It then moves on to cryptanalysis, firstly listing the “Direct Cipher” methods they looked for (all with negative results):

a. Monoalphabetic Subst.
b. Transposition – uniliteral
c. Partial encipherment with & without nulls
d. Typewriter displacement
e. Combination subst + Transp + nulls + partial encipherment
f. Commercial word codes
g. False language
( 3 books from Library of Congress:
( * “On the Choice of a Common Language”
( * “Method [?] to Esperanto”
( * “A Planned Auxiliary Language”
h. Binary substitution as superencipherment

It then lists the “Open Codes” they looked for, firstly for letters mapped across the whole specimen:

(1) Constant key positions 2-25 (Entire specimen)
(2) 1st, 2nd, 3rd …. final letters of words, initial letters – forward and reversed alphabets
(3) Beginnings & endings of lines (+10 and -10 letters)
(4) Numerical key 7469921 as letter positions
(5) Capital letters, including letters to right & left

Next they looked for words embedded in the entire specimen:

(1) Constant key positions
Also 1st, 2nd letters of constant keyed words 2-10

They moved on to search for possible “Distorted Words”:

2PONT = Dupont
1/4ly = Quarterly
AS(GESTALT)VERBENSDI
KETELLE = Catelle (psychologist) ?

They also looked for possible names, cross-referencing them in the FBI files (e.g. for B. H. KETELLE):

TYWOOD-JANOSSEY-KETELLE [lists FBI document references]
IVAN DIOLON (negative)

Next, they list various observations made by the cryptanalysts. This for me is the most interesting part by far (note that ‘Q5’ is the reference name for this ciphertext within the bundle of evidence made available to the FBI):

1. Peculiar letter combinations of contacts.
a. Phonetic & pronounceable but unintelligible. Hebrew/Yiddish influence observable.
b. Doubled letters – with “KK” at beginning of one word.
c. “MN” digram, occurring 3 times at end of words.

2. General letter frequency:
A = 21, B = 5, C = 4, D = 11, E = 35, F = 3, G = 10, H = 6, I = 15, J = 1, K = 7, L = 23, M = 10
N = 21, O = 17, P = 3, Q = 1, R = 13, S = 12, T = 15, U = 5, V = 5, W = 3, X = 1, Y = 3, Z = 5
2 = 1

3. “Digits” at bottom of Q5 not in organized form. “X” and “.” appear haphazardly. Makes no sense in attempts to convert binary to digits.

CONCLUSIONS:
1. Letter frequencies and pronounceability indicate phonetic composition.
a. May be syllable, phonetic, or artificial word code. If so, material is insufficient for analysis.
b. letter material looks like irrational or unsystematic composition

2. Digits have irrational composition. Positions of “X” and “.” do not assist solution. Concentrated efforts on “Binary”, produced no significant results. If digit code on cipher underlying binary, material is insufficient.
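
As a quick cross-check on the FBI’s letter counts above, here is a minimal Python sketch that totals them and computes the index of coincidence (the chance that two letters picked at random from the text are the same). The counts are copied from the file; leaving out the lone “2” is my own choice, and nothing in the file says the FBI ran this particular test.

```python
# Letter counts as listed in the FBI file (the single "2" is deliberately left out).
FBI_COUNTS = {
    "A": 21, "B": 5, "C": 4, "D": 11, "E": 35, "F": 3, "G": 10, "H": 6,
    "I": 15, "J": 1, "K": 7, "L": 23, "M": 10, "N": 21, "O": 17, "P": 3,
    "Q": 1, "R": 13, "S": 12, "T": 15, "U": 5, "V": 5, "W": 3, "X": 1,
    "Y": 3, "Z": 5,
}


def index_of_coincidence(counts):
    """Probability that two letters drawn without replacement are identical."""
    total = sum(counts.values())
    if total < 2:
        raise ValueError("need at least two letters")
    matching_pairs = sum(n * (n - 1) for n in counts.values())
    return matching_pairs / (total * (total - 1))


if __name__ == "__main__":
    print("letters:", sum(FBI_COUNTS.values()))
    print("IoC:", round(index_of_coincidence(FBI_COUNTS), 4))
```

The counts add up to 255 letters and the IoC comes out at about 0.062. The usual textbook figures are roughly 0.067 for English and 0.038 for uniformly random letters, so this is at least consistent with the FBI’s remark about “phonetic composition”, though 255 letters is a small sample to lean on.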

Is It A Real Cipher?

On the one hand, the parents “did state, however, that there were a number of papers of a similar nature in their home in Brooklyn. They stated also […] that Benjamin Birnbaum […] would undoubtedly be able to furnish information concerning the ‘coded note.’ ” Later: “Mrs Rubin stated that her son and his friend, Benjamin Birnbaum, often exchanged “notes” similar to the one found taped to the subject’s abdomen when he was found.” (p.17). Yet “BIRNBAUM advised that he never sent to nor received a coded message from deceased” (p.29).

Rubin was also fascinated by binary. Birnbaum “stated that the deceased talked of using a word unit code with numbers for each word. The numbers would then be transmitted into the binary code. BIRNBAUM advised that the binary code is a code used on all calculating machines. He stated that the deceased was going to use another stage of transmitting this code unknown to BIRNBAUM.” (p.35)
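
To make Birnbaum’s description concrete, here is a minimal Python sketch of the first two stages only (word to number, number to binary). The codebook, the example words and the five-bit width are all hypothetical: the file gives no word list, and the third stage Birnbaum mentions is unknown, so it is not modelled here.

```python
# HYPOTHETICAL codebook: Rubin's real word list (if it existed) is not in the file.
CODEBOOK = {"MEET": 5, "AT": 9, "NOON": 19}


def encode_words(message, codebook, width=5):
    """Replace each word by its codebook number, written as fixed-width binary."""
    groups = []
    for word in message.upper().split():
        number = codebook[word]  # raises KeyError for words not in the book
        if number >= 2 ** width:
            raise ValueError(f"{number} does not fit in {width} bits")
        groups.append(format(number, f"0{width}b"))
    return groups


def decode_groups(groups, codebook):
    """Reverse the process: binary group to number to word."""
    number_to_word = {number: word for word, number in codebook.items()}
    return " ".join(number_to_word[int(group, 2)] for group in groups)


if __name__ == "__main__":
    groups = encode_words("meet at noon", CODEBOOK)
    print(groups)
    print(decode_groups(groups, CODEBOOK))
```

The point of the sketch is simply that, without the codebook, the binary groups carry no recoverable structure, which fits the FBI’s “material is insufficient” conclusion.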

I think it is abundantly clear from the file that Paul Rubin and Benjamin Birnbaum did communicate by means of cipher, even if Birnbaum strenuously denied this under interview. Rubin and Birnbaum thought that everyone else was not only stupid, but deserved to be treated as stupid (and said so in the interview): so if Birnbaum treated the FBI as stupid, we should perhaps not be surprised.

In my opinion, the cipher would seem to be completely genuine, and Paul Rubin’s parents and Benjamin Birnbaum did initially have access to Paul Rubin’s codebook. However, I strongly suspect that they chose to destroy it rather than give it to the government, lest the ciphertext reveal something unsavoury about the dead student’s end – better for him to die in mystery than in possible ignominy.

“La Buse, l’or maudit des Pirates de l’océan Indien” is a two-part (i.e. 2 x 52 minutes) documentary with fictional re-enactments (you get the idea) made by Kapali Studios, and due for release around January 2019 (so no need to get too excited just yet).

If (like me) you’re a pirate museum trivia fan, you’ll be interested to hear that the film-makers did their talking heads interviews in the Musée de la Marine and the Musée Cognacq-Jay (both in Paris), as well as on “L’Étoile du Roy”, a 46m replica of an 18th century British sixth-rate frigate that is a well-known tourist attraction in Saint-Malo (it was previously used as HMS Indefatigable in the TV series “Hornblower”).

Of course, as a cipher historian who cannot for the life of me see any actual connection between La Buse and “his” cryptogram, there could be no place set for me at that particular table – for realistically, where would the mystery be without the cryptogram? But while I don’t hold out a lot of hope for cryptological accuracy here, I’m sure the production will look beautiful. 🙂

The Eye Candy Bit

There are some nice behind-the-scenes images on the Kapali Studios website which I thought it would be nice to share here:

There were some other images here:

The total of my cipher mystery book purchases for 2017 was £260, which was actually a little lower than recent years (it’s been fairly quiet). For a change this year, I thought I’d list them here in all their eclectic glory.

I’ve lightly annotated each of these cipher mystery books, to cast a little glancing light on the areas of research I’ve been working on. Make of them all what you will!

* The Palaeography of Gothic Manuscript Books: From the Twelfth to the Early Sixteenth Century (Cambridge Studies in Palaeography and Codicology), by Albert Derolez.

Magisterial yet accessible, a really great book on Gothic palaeography. Of course, you then have to try it out in the field for a decade to be any good at it, but… palaeography is what it is. 😉

* A Man of Misconceptions: The Life of an Eccentric in an Age of Change, by John Glassie

Basically, the best-known modern biography of Athanasius Kircher. Perhaps a bit too generous towards its subject in places for my tastes, but it certainly covers all the ground. My focus when I bought this was on the people who carried on Kircher’s legacy, which turned out to be a very small group indeed.

* Regiomontanus: His Life and Work: Volume 1, by Ernst Zinner

Epic, detailed, stunning evocation of Regiomontanus. I’ve long wanted to read this, but until this year (when I found myself wanting to know whether Regiomontanus might have seen Vat. Gr. 1291 when it arrived in Rome), I could never quite justify the cost. Regardless, it turns out that it’s well worth the money – recommended.

* The Secret Code-Breakers of Central Bureau: how Australia’s signals-intelligence network helped win the Pacific War, by David Dufty (ebook).

Nice little book on Australia’s surprising war-time cryptology effort, something that tends to get trampled by gung-ho American cryptology historians. And no, it’s not all about Eric Nave (he actually plays a surprisingly small part in this account).

* Comment ils ont trouvé un trésor, by Alain Cloarec

Fairly lightweight, but helped me understand some of the practicalities of French treasure hunting law. But that’s another story…

* Maps, Mystery and Interpretation: 2. The Mystery: Oak Island Speculation: Volume 2, by G. J. Bath
* Maps, Mystery and Interpretation: 3. Interpretation: Sizing Up the Money Pit: Volume 3, by G. J. Bath
* Anson’s Gold: and the Secret to Captain Kidd’s Charts, by George Edmunds

I reviewed Bath’s books and Edmunds’ book in my blog.

* The Sirius Mystery: New Scientific Evidence for Alien Contact, by Robert K.G. Temple

Cipher Mysteries commenter Astronomical challenged me to read this, to make up my own mind about Temple’s Sirius theories (though on 1st April, so it’s hard to be sure). However… now that it has arrived, I just haven’t been able to get excited enough to actually pick it up, so it’s still waiting patiently on my bookshelf.

* The Templars: The Secret History Revealed, by Barbara Frale

Oh my, what an excellent little book this is. Anyone wanting to read about the Templars should start here. Highly recommended!

* Playing the Numbers: Gambling in Harlem between the Wars, by Shane White, Stephen Garton, Dr. Stephen Robertson, and Graham White

Very interesting book on the subculture of gambling that I touched on in my blog.

* Ella Minnow Pea: A Novel in Letters, by Mark Dunn

Funky littl novl that msss around with th problms of writing whn crtain lttrs ar not allowd to b usd. 😉

* Generation of Vipers, by Philip Wylie

The book that Paul Rubin was supposed to be a follower of. Interesting (and surprisingly influential) mid-20th century nonsense.

* The Devil’s Chessboard: Allen Dulles, the CIA, and the Rise of America’s Secret Government, by David Talbot

Another book triggered by the release of the FBI papers concerning Paul Rubin: I wanted to know more about Allen Dulles (whose surname seems to appear in Paul Rubin’s cipher, or is at least in his covertext).

* Sleepwalkers, by Arthur Koestler

A readable (but now rather dated) account of the development of astronomy.

* Humanism, Scholasticism, and the Theology and Preaching of Domenico de’ Domenichi in the Italian Renaissance, by Martin F. Ederer.

I wanted to know more about Domenico de’ Domenichi (who owned Vat. Gr. 1291), and this is probably the best book on the subject out there.

* The Renaissance in Rome, by Charles L. Stinger

I bought this to cast a light on what was going on in Rome circa 1460-1470, which is where some of my secondary Voynich research paths are now starting to vaguely lead.

* French Painting in the Time of Jean de Berry, by Millard Meiss

Splendidly detailed book, but don’t buy it expecting lots of extraordinary pictures: it’s mainly fine-detailed history. 🙂

* Solution of the Voynich Manuscript, by Leo Levitov

I’ve been meaning to buy a copy of this for myself for ages, and finally got round to it. However, I couldn’t bring myself to pony up the far greater amount for Joseph Martin Feely’s “Roger Bacon’s Cipher; The Right Key Found”, so if anyone just happens to have a digital copy of that, please let me know. 😉

* From Magic to Science: Essays on the Scientific Twilight, by Charles Singer.

Though I’ve ordered this, it hasn’t yet arrived. This was prompted by an updated page on Rene Zandbergen’s site which quotes Erwin Panofsky’s thoughts on the Voynich Manuscript in a less abbreviated form than has been the case.

After The History Channel’s recent season of “The Hunt For The Zodiac Killer” programmes (episodes 1, 2, 3, 4 and 5), I thought it was time to get back to some non-fake-news codebreaking research.

In particular, I want to suggest an approach we might follow to try to solve the Z340 that (hopefully) won’t need a brain the size of a planet to run it. But first I’m going to talk about the Z13 cipher, because I think it tells us a lot about what is hidden inside the Z340 and indeed why the Z340 was written at all…

The Z13 Cipher

The text just above the Zodiac Killer’s Z13 cipher (20th April 1970) clearly and unambiguously refers back to a ‘name’ supposedly in the Z340 cipher (8th November 1969), though as far as I can see the “Dripping Pen” note that arrived with the Z340 didn’t mention a name at all:

An oft-repeated account for this is that the Z13 had been constructed in response to a kind of cryptographic ‘taunt’ that appeared six months previously in the Examiner newspaper on 22nd October 1969, as detailed here. In the Examiner piece, entitled “Cipher Expert Dares Zodiac To ‘Tell’ Name”, the President of the American Cryptogram Association issued a direct challenge to the Zodiac Killer to reveal his name in a cipher.

However, if you put all these pieces together, it seems highly likely to me that it was instead the Z340 cipher that had been constructed as a response to President Marsh’s taunt (it appeared a mere seventeen days later). Hence it seems entirely reasonable to conclude that the Z340 indeed contains a specific name for us to decrypt – though, as always, it seems highly unlikely that this will contain the Zodiac Killer’s actual name.

Cryptanalytically, though, the Z13 couldn’t be further removed from the homophonic world of the (cracked) Z408 (and presumably the Z340), in that it has shape repeats and internal structure aplenty. In fact, if you colour all the Z13’s repeated cipher shapes (once again, using Dave Oranchak’s neat-o-rama Cipher Explorer), this is what you see:

Much as I love “Sarah The Horse” and “Clara Cataract” as elegant literary plaintexts for this, it’s important to note that these are homophonic solutions for something whose many repeats point to its actually being a monoalphabetic substitution cipher. Dave Oranchak’s “Laura Catapult”, and glurk’s “Gary Lyle Large” are fine examples of how it is possible to construct name-like phrases to fit: but these are relatively rare examples in a surprisingly sparse, errm, name-space.

In many ways, whereas the problem with the Z340 is that it has too many shapes, the problem with the Z13 is arguably that it has too few shapes. So there would seem to be something a little odd going on here, cryptanalytically speaking: something feels wrong.

In his 2017 book “Unsolved!”, Craig Bauer praised a possible crack of the Z13 cipher which I hadn’t previously heard of, and credits p.128 of Robert Graysmith’s (2002) “Zodiac Unmasked: The Identity of America’s Most Elusive Serial Killer Revealed” as the source (though Graysmith talks about it as if the suggestion were as old as the [Hollywood] Hills):

Now, even though this doesn’t quite fit the pattern (the N cipher shape shouldn’t be shared between plaintext F and M), I think Bauer was completely right to give this his imprimatur, because it seems exceptionally close. Giving MAD Magazine’s “Alfred E. Neuman” as his name feels like exactly the kind of thing the Zodiac Killer would do, in that it is taunting, unhelpful, superior, nasty, satirical, self-centred, and narcissistic in all the right ways.

For ALFREDENEUMAN to be the Z13’s plaintext, the only concession you would need to make is that a single letter was misenciphered: and as starting points go for a ciphertext that already feels as though it has too few shapes, this is not half as big a step as almost all other solutions I’ve seen proposed. Even though I completely accept that this isn’t cast-iron proof, I do think it suggests that it is well worth considering as a conditional piece of evidence to work with.
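
The “single misenciphered letter” claim is easy to check mechanically. The Python sketch below lists every cipher shape that would have to stand for more than one plaintext letter under a proposed reading. Note that the transcription string is an assumption of mine rather than something taken from this post: it follows the commonly circulated Z13 transcription, with the lowercase letters z, o and t as arbitrary placeholders for the three non-alphabetic shapes, so please check it against the original before relying on it.

```python
from collections import defaultdict

# ASSUMPTION: stand-in transcription of the Z13, not taken from the post.
# Lowercase z, o, t are placeholders for the three non-alphabetic shapes.
Z13_ASSUMED = "AENzoKoMotNAM"
CANDIDATE = "ALFREDENEUMAN"


def shape_conflicts(cipher, plain):
    """Map each cipher shape to its plaintext letters, keeping only the clashes."""
    if len(cipher) != len(plain):
        raise ValueError("cipher and plaintext must be the same length")
    seen = defaultdict(set)
    for shape, letter in zip(cipher, plain):
        seen[shape].add(letter)
    return {shape: sorted(letters) for shape, letters in seen.items() if len(letters) > 1}


if __name__ == "__main__":
    # Shapes that would need to encipher two different letters.
    print(shape_conflicts(Z13_ASSUMED, CANDIDATE))
    # Swapping the arguments checks the other direction (one letter, several shapes),
    # which a strictly monoalphabetic cipher would also forbid.
    print(shape_conflicts(CANDIDATE, Z13_ASSUMED))
```

With the assumed transcription, the only clash reported is the N shape standing for both F and M, and the reverse check comes back empty, which matches the concession described above.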

And Now, The Z340 Cipher…

For me, the big (if not ‘huge’) question the above leads to is this: if this ALFREDENEUMAN Z13 decryption is actually correct, might the Zodiac Killer have included exactly the same name in his Z340 cipher? And if so, might we be able to use the name as a known-plaintext crib into the Z340? (AKA a block-paradigm match. 🙂 )

Assuming the Z340 does use some kind of homophonic cipher, there are (340 – 12) possible positions the Z13 crib could be positioned at: however, we should be able to eliminate any position containing a cipher shape repeat within the 13-shape stretch that does not match a repeat in the ALFREDENEUMAN crib, because that would mean that the same homophonic cipher shape would have been used to encipher two different plaintext letters.

For example, because Z340 line #4 begins “S99…”, the “99” part could not be any part of the Z13 crib because there are no doubled letters in “ALFREDENEUMAN”: this is also true for the “++” pairs in lines #4, #14, and #18. Similarly, the +..+ repeat on line #9 and the W..W repeat on line #18 both cannot be in the crib, because no plaintext letter is repeated three steps apart in “ALFREDENEUMAN”. If you run this against the most widely used Z340 transcription, there are – according to the vanilla C test I put together – exactly 197 valid crib positions. So we can eliminate (340-12-197) = 131 candidate positions. Which is nice. 🙂
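
The filtering rule described above can be sketched in a few lines of Python. This is an illustrative reimplementation under my own naming (`fits_crib` and `valid_crib_positions` are hypothetical helpers), not the C test mentioned in the text, and the toy ciphertext is made up: to get real numbers you would need to feed in a Z340 transcription with one character per cipher shape.

```python
CRIB = "ALFREDENEUMAN"


def fits_crib(window, crib):
    """True if no cipher shape in the window would need two different plaintext letters."""
    assigned = {}
    for shape, letter in zip(window, crib):
        # setdefault records the first letter seen for a shape and returns it afterwards
        if assigned.setdefault(shape, letter) != letter:
            return False
    return True


def valid_crib_positions(ciphertext, crib=CRIB):
    """Start offsets at which the crib could sit under a homophonic cipher."""
    span = len(crib)
    return [
        start
        for start in range(len(ciphertext) - span + 1)
        if fits_crib(ciphertext[start:start + span], crib)
    ]


if __name__ == "__main__":
    toy = "S99abcdefghijkl"  # made-up example: the doubled "9" blocks offsets 0 and 1
    print(valid_crib_positions(toy))
```

Note that the test only runs in one direction: several shapes may share a plaintext letter (that is what makes the cipher homophonic), but one shape may not stand for two letters. A 340-shape text gives 340 - 13 + 1 = 328 windows to test, which is the (340 – 12) figure above.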

What I find interesting is that locking a fixed set of letters to an (albeit still hypothetical) crib should enable us to use a homophonic solver on far smaller subsections of the Z340 than we would normally be able to do. I’ve written before about how the top half and the bottom half of the Z340 have quite different (but subtly overlapping) properties: for example, how top-half ‘+’ characters seem to work differently to bottom-half ‘+’ characters. As a result, I think it would make sense to try to solve lines #1 to #9 separately from lines #11 to #19.

But there are other results that point out how lines #1 to #3 seem to work quite differently from lines #4 to #6, and so on. So the ability to try to solve even smaller blocks of lines may well be a critically useful string to our cryptological bow.

Unfortunately, I’m not (yet) a zkdecrypto-lite power-user, so I don’t know how to automate this kind of search. Anyone who would like to collaborate on doing this, please feel free to step forward: or if you want to take the idea and do what you like with it, that’s fine by me too. Can you blame me if I want to see this solved before they start shooting Season #2? 😉

Just One Last Thing…

There is, of course, one other possibility that should be investigated… it’s just that those cold, creepy eyes in the famous Zodiac poster remind me of someone, can’t think who it is, but the name might come to me soon, who was it…?

Without any further ado, here’s The History Channel’s “The Hunt for the Zodiac Killer” season #1 finale, wherein Craig Bauer, having immersed himself almost completely in Zodiac Killer arcana, conjures up a new solution of the Z340, whereupon everyone else falls (or seems to fall) in line:

video since removed from TagTele site

Well… OK, I guess. I suspect the things most people would agree on about this ‘solution’ are:
* it’s primarily intuitive, and not really ‘cryptological’ in any useful sense of the word
* it’s either really brilliant or really foolish, and almost certainly nowhere in between

Craig’s Crack

Because the starting point for Craig Bauer’s decryption attempt was the idea that some letters might actually encipher themselves (to make the answer hide in plain sight), I’ve added a green background to those letters (or simply transformed letters) where the ciphertext and his decrypted text coincide, e.g. “HER……KI.L….” on the topmost line. You should be able to see 23 green-backgrounded letters.

However, for the sake of balance, I’ve also added a red background to those letters (or simply transformed letters) where the two do not coincide, e.g. “…PLVVP….TB.D” on the topmost line. You should be able to see 61 red-backgrounded letters (I think).

To make the following diagram, I used Dave Oranchak’s funky online Cipher Explorer tool:

It should be immediately obvious that a very high degree of selectivity is going on here: furthermore, seven letters are left out (on lines 2, 3, 5, 6 and 7), while three extra letters are inserted (lines 5 and 8). Finally, there is no consistent mapping of other shapes to plaintext letters as per the claimed decrypt, which is why I think it is safe to say that this is not a ‘cryptological’ decryption in any useful sense of the word.

The notion that a given historical ciphertext uses a handful of actual letters as themselves while the rest are somehow illusory or made up is an illusionary amateur cipher-breaking trope I have seen many dozens of times. In every case, it is a Pyrrhic victory of intense hopefulness over good sense, and achieves nothing bar wasting my time. If anyone can point my attention to anything about this particular decryption that varies from this rather self-defeating and useless template, I’d be fascinated to see it: but so far, this is just about as bad as it gets.

The motif of this antipattern is the codebreaker dreaming themselves an intense imaginary journey into the world of the codemaker, and bringing back as their prize a sampling of their vision, one that is every bit as hard to read as a book in a dream. All they have is the enduring conviction that they have solved it, a conviction that gets strengthened the brainier they are (and hence the more ingenious their post-rationalizing retro-fitting gets).

Total Immersion Delusion

If I were to give this kind of behaviour a “Pattern” name, I’d probably choose “Total Immersion Delusion”. Only someone who feels they have totally immersed themselves in their imagined world of the cipher maker would propose such a thing, and in almost every single case it is – sadly – a delusion that gets conjured up.

Here, you can see the seeds of the dream forming in the first line’s “HER…” and “KI.L” word-fragment patterns: but as the dream progressively fades away, the ability of the dreamer to fit the shape to the overselected letters reduces and reduces, until they’re left with only the sketchiest outlines of hope (a single green letter on lines 4, 5 and 7 demonstrates the degree to which it has triumphed over rationality here).

Sorry, but from what I can see, this Z340 ‘solution’ isn’t even close to being close: nobody’s going to come out of this particular dungheap smelling of roses, no matter how hard you hold your nose. Not huge, not a game-changer, sorry.

You may not be aware that there are, out in the world, many private languages – languages that offer speakers and listeners within a particular group or subculture the ability to talk about things that could if said in public cause one or both to be hated, persecuted, prosecuted, or even killed.

Antilanguages

The academic literature sometimes calls these antilanguages, a term coined by M.A.K. Halliday in 1976 to describe those secret languages used by (what he also named) “anti-societies”, e.g. prisoners, fairground workers, gay men, lesbians, thieves, Voynich researchers ( 😉 ), etc.

For me, I suspect that pitching them as tools of active resocialization formed within consciously-formed alternatives to the mainstream (as Halliday does) is a little bit overreductive: from my point of view, they are necessary parallel forms of language when the use of mainstream language would be personally problematic.

Polari And Its Sisters

This need for ‘privacy in public’ has led to a large number of cant slangs, such as Polari – this is a very old UK cant slang used by many subcultures (fairground people, Punch & Judy men, gay people, etc). Even though there are two fairly recent books on it (both by Paul Baker), this private language first came to wider public attention (a linguistic paradox if ever I heard one) thanks to the 1960s radio programme “Round the Horne” and its two Polari ‘homy polones’, Julian and Sandy.

Even though Polari seems – in my opinion – to have a closer historical connection to Punch and Judy performers than anything else, it has become best known (thanks to Julian and Sandy, arguably the first “celebrity gay couple”, as one programme put it) as a gay cant slang.

Yet even if it was culturally appropriated in this way, Polari is far from the only gay subculture language. One could quickly point to Bahasa Binan in Indonesia, Gayle and IsiNgqumo in South Africa, and Swardspeak / Bekimon (short for “Baklang Jejemon”) in the Philippines, all countries where verbally displaying as gay can be physically perilous in the extreme.

It’s a fascinating linguistic area, for sure.

Calabar Lesbian Cryptic Languages

Now adding to this existing array of public/private languages is Nigerian researcher Waliya Yohanna Joseph at the University of Calabar. Calabar is a port city, capital of Cross River State on the Nigerian border with Cameroon: incidentally, it’s a part of Nigeria I happen to have an (indirect) connection with.

Waliya’s article (which appears in Anna Odrowaz-Coates & Sribas Goswami (2017) Symbolic violence in social contexts. A post-colonial critique.) is called “Calabar Lesbian Cryptic Languages”, and is downloadable on ResearchGate.

As you would expect, using this cryptic language yields the speakers “emotional and sexual liberation, as well as anthropological security”, because “Lesbians preferred not to be known by the general public in Nigerian society.”

Though the basic form is numbers, there are also enciphered forms. One ciphertext is quoted:

Ba1baya ga3rala, 3 la4va2 5. 5 1ra2 sa4 ba215ta3fa5la. 3 na22da 5 3na maya la3fa2.
Wa3lala 5 la4va2 ta4 ba2 maya fara32nada? 3 para4ma3sa2 ta4 la4va2 5 1nada 5 1la4na2.
Ra2palaya !
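
For what it’s worth, the quoted ciphertext looks as though it follows a simple rule: the digits 1 to 5 stand for the vowels a, e, i, o, u, and every consonant is followed by a filler “a”. That reading is an inference from the sample alone rather than anything stated here, so treat the Python sketch below as a hedged guess at the scheme and not as the paper’s own description.

```python
# ASSUMED scheme, inferred from the quoted sample: digits are vowels,
# and each consonant carries a meaningless "a" after it.
VOWEL_DIGITS = {"1": "a", "2": "e", "3": "i", "4": "o", "5": "u"}


def decode(text):
    """Undo the assumed digit-for-vowel, filler-'a' encipherment."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in VOWEL_DIGITS:
            out.append(VOWEL_DIGITS[ch])
        elif ch.isalpha():
            out.append(ch)
            # skip the filler "a" that follows the consonant, if there is one
            if i + 1 < len(text) and text[i + 1] == "a":
                i += 1
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(decode("Ba1baya ga3rala, 3 la4va2 5."))
```

Under those assumptions the first sentence comes out as “Baby girl, i love u.”, and the remaining lines decode to similarly readable English, which suggests the rule is at least close to the real one.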

The paper goes on to say that “violence against the female child among other things are also very prevalent in Single public schools where Senior students and self-declared school mothers force, entice and cajole the juniors in the hostels to practice lesbianism, abortion and prostitution.” It makes for uncomfortable reading at times.

At the same time, it has to be said that the author’s position seems to be perhaps a little overfocused on Calabar: the idea that lesbianism is perilous in Africa in general, even more so in Nigeria, and yet more so in Calabar doesn’t really do the paper justice. There also seems to be a moralistic tightrope being patrolled here, when the paper pitches itself as being…

“of help especially to parents, teachers, lecturers, matrons, guardians and care givers who may wish to protect their wards from strange sexual orientations such as same sex relationships.”

Still, it is a bold and interesting piece of work, shining a light not only on the edges of language and cipher, but also on the sharp differences between the specific mores of Calabar and those generally of “post-colonial” societies – particularly of those people reading the paper.