In Essex County, New York in 1882, a mysterious man called Henry D. Debosnys was convicted of (and executed for) killing his newly-married wife, a widow by the name of Elizabeth Wells. He claimed to have been born near Lisbon in 1836, but refused to identify himself further, saying that to do so would bring shame upon his family.

From his time in jail, Debosnys left behind some drawings and sketches; some poetry in French (though fairly egotistical and shallow, it has to be said); and some cryptograms, at least one of which also seems to be a poem:-

[Image: Debosnys-Cipher-3a]

[Image: Debosnys-Cipher-3b]

However, nobody has so far decrypted so much as a word of any of these. I’ve put a complete set of scans on the Cipher Foundation website, and will post transcriptions there in a while, when I’ve worked out a good way of transcribing them (because doing so isn’t quite as easy as you might think).

I first found out about this cipher mystery from Klaus Schmeh’s presentation at the recent (2015) NSA cryptology history conference (as recorded by Rich SantaColoma), which focused on those historical affairs that involve both an unsolved cipher and an unsolved (or at least somewhat unexplained) crime.

Incidentally, there’s a 127-page book on the whole Debosnys affair – Adirondack Enigma: The Depraved Intellect and Mysterious Life of North Country Wife Killer Henry Debosnys by Cheri L. Farnsworth (2010) – which I have (of course) ordered and will discuss further when it lands on my doorstep.

But meanwhile, there’s plenty to be said about Debosnys’s ciphers themselves…

Initials in the Ciphers

Given that Debosnys claimed to have previously been married twice, the presence of a pair of holding hands next to the initials “L.M.F.” is suggestive of something to do with a relationship:-

[Image: debosnys-lmf]

There are also some initials embedded in one of the cryptograms, with a number of dots below each letter.

[Image: debosnys-hddlmf]

Given that the man’s name was Henry D. Debosnys, these H[4] D[8] D[7] L[6] M[6] F[5] patterns of dots suggest (if I have been able to count the dots correctly from the scan) that they code for H[enry] D[xxxxxxxx] D[ebosnys] L[xxxxxx] M[xxxxxx] F[xxxxx]. And my guess is that this will be true of sub-glyph dots elsewhere in the cryptograms. I would be interested to know what his middle name was, given that this pattern of dots seems to indicate its length.
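The dot-count idea is easy to check mechanically. Here is a minimal sketch, assuming each dot stands for one letter following the initial; the dot counts are the ones read off the scan above (and so may be miscounted), while the function name and the two middle-name candidates are purely hypothetical illustrations, not suggestions from the source:

```python
# Dot counts as read from the scan (possibly miscounted, as noted in the post).
DOT_COUNTS = [("H", 4), ("D", 8), ("D", 7), ("L", 6), ("M", 6), ("F", 5)]

def fits(initial, dots, candidate):
    """True if `candidate` starts with `initial` and has exactly `dots` letters after it."""
    letters = [c for c in candidate if c.isalpha()]
    if not letters:
        return False
    return letters[0].upper() == initial.upper() and len(letters) - 1 == dots

# The two names we already know both fit the pattern.
print(fits("H", 4, "Henry"), fits("D", 7, "Debosnys"))

# Hypothetical middle-name candidates, only to show how a shortlist could be filtered.
for name in ["Daniel", "Dominique"]:
    print(name, fits("D", 8, name))
```

The same filter could be run over any list of period French (or Portuguese) names to shortlist candidates for the L, M, and F names.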

Doubled Letters

From the cryptograms I have seen (and, needless to say, there may be more that I haven’t), there are only three glyph shapes that obviously appear in pairs:

[Image: debosnys-doubled]

Of these three, the doubled dotted-X pair (six or so instances) appears much more frequently than the other pairs (one each): which makes me suspect it is a genuine letter. If I had to guess, though, I’d predict that this “XX” glyph pair stands in for the French “double-v” (i.e. the letter “W”), because of the absence of many other doubled glyphs in the text. Doubtless you’ll have your own opinion, though.

By the way, does anyone have the doubled-letter instance statistics for French?
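In the absence of published figures, doubled-letter statistics can be counted directly from any French text you trust. The following is a rough sketch only: the sample sentence is invented, and folding accents (so that "é" followed by "e" counts as "ee") is my assumption rather than anything known about how Debosnys would have treated accented letters.

```python
from collections import Counter
import unicodedata

def doubled_letter_counts(text):
    """Count adjacent identical letters (e.g. 'll' in 'elle') in a text."""
    # Assumption: strip accents so that 'é' and 'e' are treated as the same letter.
    folded = unicodedata.normalize("NFD", text.lower())
    folded = "".join(c for c in folded if not unicodedata.combining(c))
    counts = Counter()
    for a, b in zip(folded, folded[1:]):
        if a == b and a.isalpha():
            counts[a + b] += 1
    return counts

# Invented sample sentence; a real test needs a large 19th-century French corpus.
sample = "Elle s'appelle Anne et elle attend la nouvelle année."
print(doubled_letter_counts(sample).most_common())
```

Run over a decent-sized corpus, the ranking of pairs (and how far the top pair outstrips the rest) could then be compared with the six-to-one-to-one pattern seen in the cryptograms.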

Snakes (but no Ladders)

The mention of snakes (as ornamental design features) in the Copiale Cipher certainly had some kind of echo (perhaps inadvertently, perhaps not) in the second La Buse cryptogram. And, curiously, six snakes appear integrated into the text of Debosnys’s cryptograms:-

[Image: debosnys-snakes]

Even though I don’t yet know what is going on with these ciphers, I personally would be somewhat surprised if they turn out to have anything obviously to do with Freemasonry. But that’s just my opinion, make of it what you will (a crane, perhaps, or possibly a boat, depending on how good you are at origami).

Clustered Glyphs

These sit right at the heart of the problem we face when we try to transcribe Debosnys’s ciphers. So many of the glyphs we see are formed from a consistent set of subshapes that it looks very much to me as though many of these component pieces will turn out to be vowels, common letters or perhaps even spaces.

I’ve put a few of these clusters together here:

[Image: debosnys-clusters]

So, what would be the best way of transcribing these clusters? It’s far from obvious to me: I suspect that as soon as you know how to transcribe them, you’ll also know how to read the whole text.

And The First Decrypted Word Will Be…

For me, there’s little doubt that the first decrypted word will be either ‘je’, ‘me’ or ‘moi’, and that it will be in the poem-like section of the cryptogram. This is simply because Debosnys’s French poem uses the words ‘je’, ‘me’ or ‘moi’ on pretty much every line, so it seems highly likely to me that his encrypted poem will turn out to have much the same (egotistical) profile.

But please feel free to form your own cipher theories. 🙂

As I mentioned not long ago, while spending an enjoyable afternoon last Saturday mooching round the London Library in an even-though-I’m-feeling-a-bit-lost-that’s-basically-OK kind of way, I found sufficient time to scan in the whole of Charles de la Roncière’s (1934) “Le Flibustier Mysterieux” and take it home on a memory stick.

And now I’ve read it, I have to say it’s… really quite different from what I expected. The keel (if you like) of the book is de la Roncière’s quest to attribute the 17-line cryptogram to an Indian Ocean French pirate. He takes the approach of examining lots of pirate / treasure stories from broadly the right time and place, and seeing if he can use them to gain a glimpse of the mysterious man hiding behind the cryptogram’s curtain.

His book was clearly, I think, written for a popular audience: and even though he occasionally tries to affect academic detachment and skepticism of his sources, the raw evidence he’s moulding the whole thing from is simply too slight. Being brutally honest, I came to the book expecting a soupçon of the rigour and maritime erudition that he brought to his (literally) heavyweight six-volume “Histoire de la Marine Française” (from 1898 to 1932!): but found not so much as a single footnote. Perhaps he had footnoted himself out over those long decades.

On balance, though, perhaps that’s not so much of an issue for “Le Flibustier Mysterieux”, because it isn’t honestly that kind of a beast: rather, it’s both entertaining and an (on the whole) easy read.

All the same, I think it suffers from one big underlying problem, un éléphant dans la chambre: that History – and in particular historiography – has changed so much in 80 years that we would need the whole thing annotated and positioned within the context of what we now know in order to make proper sense of what he’s saying. Otherwise his book would be no more than a Parisian curio, a 115-page historical footnote (if you like).

Cipher Foundation Microproject #1

As I mentioned before, what I originally had in mind here was asking people to volunteer to transcribe two or three pages each from scans: but having now myself sat down and typed in sixty-five (small-ish) pages in one day, it quickly became clear that it would take far less time and effort to do it myself than to set up and coordinate a way of collaborating broadly to make that happen.

Hence what I want to aim for instead here is something a bit bigger, more thoughtful, and (I hope) more genuinely revealing; and something that overall better fits The Cipher Foundation’s charitable purpose (to “improve awareness of historical codes and ciphers”).

So rather than just transcribe it, what I’m now planning to do is commission an annotated translation of it. In short, The Cipher Foundation’s first historical cipher microproject will be: Translating “Le Flibustier Mysterieux”.

Inevitably, I haven’t worked out all the details yet (do you think Indiegogo would be the best crowdfunding platform? Or perhaps somewhere else entirely?) and it’ll take more than a few days to get the bank account, PayPal account, and the friendly-looking [Click Here To Donate] button all working etc. But it’s a plan, and – I hope you’ll agree – not an entirely bad plan either.

And if I do pretty much the opposite of everything Derek Abbott did with his attempt at crowdfunding, it should work out fine. :-p

Does that make sense to you?

I’ve just had a day at the London Library, thanks to a £15 Day Pass scheme they offer (though note you have to bring various forms of current ID with you, and to let them know in advance – you can’t just turn up).

The main reason I went there was to have a look at the only copy of Charles de la Roncière’s 1934 “Le Flibustier Mysterieux” that WorldCat knows of in the UK (much more on that another day), but in the meantime there’s a lot more to be said about London Library.

For a start, I have to say it’s maddeningly frustrating to the point of near impossibility to find your way around the place. Whereas full members get given a heavyweight induction (I suspect so that people don’t have the embarrassment of stumbling over a new member’s corpse, lying long-dead in a far unlit corner of History Level 6), day pass visitors get dumped in the deep end. Clearly, nobody cares if they live or die: so I’m just glad I got out alive. 🙂

As an aside, if you do want to find your way around London Library, my three top tips are:
(a) Because the building is in two halves (History/Science in one and Art/Language/Fiction in the other), the easiest / most reliable way to get from one to the other is all the way down to the reception area and back up again. Boring, but effective.
(b) Don’t be afraid to turn lights on yourself (most seem to be off, but you turn them on via pull-cords that are usually at the far end of the row of books you’re standing beside). Failing that, trace the wiring trunking above your head and you’ll find the cord about 50% of the time.
(c) If, like me, you want to look at the contents of “Philology, Cryptography” on the Mezzanine floor on the Arts half, ask someone to help you find it – I eventually stumbled upon it through sheer persistence (it’s shockingly similar to Platform 9¾ in Harry Potter), but that was definitely a poor choice on my part.

Speaking of “Philology, Cryptography”, the Library’s indexing scheme is just about as idiosyncratic as the Warburg Institute’s famously obtuse layout. The safest approach is to search the online catalogue to find at least one book you know is going to be there (say, David Kahn’s “The Codebreakers”, of which it actually has two copies), and work back to the book’s physical location from there. Once you’re in the stacks themselves, you’ll see all the other weird and wonderful books they have there, which is what using London Library is actually all about. (You can also do that virtually from the catalogue, though it’s not half as much fun).

Other nice things:
* If you bring along a USB stick, you can scan stuff on a funky-looking scanner for free (though it only let me store stuff as PDFs, and the adjustment roller on the left side was broken). But don’t forget to tap the on-screen SAVE button each time (easy to forget).
* If you don’t have a USB stick with you, Reception sells 2GB sticks in a range of colours for a very reasonable £2 each.
* If you’re scanning an oldish book, my advice would be to ask at the desk for a “snake” – a string containing a series of small leaded weights – to hold your book down nicely. Also: click the green horizontal bar to start a scan by squeezing it from above and below at the same time, or else your book may get disturbed.
* London Library has subscriptions to JSTOR, ProQuest and various other services; and even though the search PC itself is inaccessible, the trick is that the monitor has USB sockets on the side that you can plug your USB stick into (and so save PDFs to read at home).
* There’s a members’ lounge on the top floor of the Arts side… but I ran out of time before sampling its delights and rarefied heights.

For me, probably the London Library’s nicest resources of all are its newspaper and journal archives. How extraordinarily splendid to have The Times, The Gentleman’s Magazine, and indeed Le Journal Des Savants all in one place, along with hundreds of others.

But… given that it’s a private library you have to subscribe to, would I really want to pay several hundred pounds a year for the privilege? Well… no. While it does have an excellent and properly eclectic collection of (over a million) books, I think being a member is far more about paying for serendipity: bumping into Steven Poole’s “Unspeak”, or The BBC Guide to Radio Pronunciation for 1934 to 1937, etc etc etc. If you are a specialist researcher, it’s not so far across town to the British Library and its 170+ million items, a number which makes my jaw ache with droppingness every time I try to even think about it.

At the same time, if I wanted to go through a particular journal or book that the London Library had that wasn’t otherwise digitized, I’d happily pay £15 for a day pass for sure (as was the case here). It’s a nice experience, too (if you don’t mind feeling lost for half the day).

For any bibliophile (or indeed bibliophage) who finds themselves in London for a few days, I’d suggest that a day pass to London Library (it’s not too far from the Ritz Hotel, by Green Park) would probably be £15 very well spent. Cheaper than the London Eye! 😉

As part of the long slow process of fleshing out the Cipher Foundation’s website, I’ve added a new page there laying out the core evidence relating to the Anthon Transcript, a cipher-like document that sits right at the heart of the foundation history / mythology of the Mormon Church.

The short (non-TL;DR) version is that even though it has long been claimed that the Anthon Transcript (shown to Professor Charles Anthon in 1828) and the Caractors fragment are one and the same, a photograph that was unearthed in Clay County Museum in Missouri in 2012 seems to disprove this whole notion. Hmmm.

All the same, people continue to build high-rise cipher theories on top of this unsupportive sandy loam. Most recently, Jerry Grover announced his own fairly epic (251 pages of argument) Caractors translation, that renders the first four lines as:

In the nineteenth regnal year of Mosiah I, the Nephites traveled over the mountains to the foreign speaking people of Mulek. These twenty thousand ‘children of Mosiah’ traveled downriver on the east side of the River Sidon [Grijalva] for eighty days and reached Zarahemla. And then it came to pass that after ten years thus began the period of the Seven Tribes. After the space of twenty-one more years had passed, Zeniff, with sixty of his people, departed. Fifty-three more years then passed; then the Limhiites obtained twenty-four plates from the west in the Land of Desolation, returning upriver on the River of Lamanite Possessions [Usumacinta]. After their return upriver, seven years later, the Limhiites traveled west, bringing the pure gold Jaredite plates to Mosiah (II), which he translated. Previous to the arrival of the Limhiites, Benjamin was made King in the second month of the four hundred and thirty-sixth year after Lehi left Jerusalem. At the age of eighty-three, King Benjamin ascended to eternity, which was four hundred seventy nine years after Lehi left Jerusalem. King Benjamin’s death occurred one and one third years before the arrival of the Limhites. Four years before the arrival of the Limhites, the period of the Seven Tribes ended in conjunction with the Jubilee Year.

Personally, I’d assess the probability that this is correct as roughly the same as that of a truck load of lobsters falling out of a clear blue sky into my garden: in that I can conceive that it is (just about) possible and (broadly) consistent with the laws of physics, but (etc etc etc).

More generally, I’d offer this as a stark warning to idiot Voynich linguists such as Stephen Bax, as the kind of ultimate destination their foolish non-theorizing will lead them to.

I’ve previously blogged a number of times about Bernardin Nagéon de l’Estang: the short version is that I have yet to see a single piece of external evidence that he genuinely existed. A man with the right name did exist in the right place, but some 25 years too early for the dates: and so the reasonable – but as yet entirely unproven – presumption is that we should be looking for an unrecorded son of this man sharing his father’s name. The man certainly had several sons, not all of whom are recorded… but that’s as far as we have been able to get.

The reason anybody cares about him is that he wrote (in French, translated here) that “…at our last battle with a large British frigate on the shores of Hindustan, the captain was wounded and on his deathbed confided to me his secrets and his papers to retrieve considerable treasure buried in the Indian Ocean; and, having first made sure that I was a Freemason, asked me to use it to arm privateers against the English.” Secrets and papers which treasure hunters have been speculating wildly about ever since.

In a post from April 2015 [which I managed to miss until very recently], Emmanuel Mezino blogged about the evidence he had managed to dig up about Nagéon de l’Estang. From internal evidence, Manu reasons that the event where Nagéon de l’Estang claimed to have gained possession of “secrets” and “papers” from a dying French Freemason sea captain must surely have happened prior to 1789 [though personally I’m not so sure his logic holds]; and so Manu then winds the historical clock back to 1781-1783 when, in a series of five battles between Admiral Hughes’s squadron and Admiral le Bailli de Suffren’s squadron off the coast of Cuddalore, three French sea-captains died. Manu lists these as:

* The Chevalier Eleonore Perier de Salvert (whose life and Freemasonry connections are ably described here), commander of Le Flamand [50 guns];
* Captain Dupas de la Mancelière, Captain of the Ajax [64 guns];
* Captain Dien, Commander of the fire-ship [probably 0 guns] launched under the orders of Captain De Langle of Le Sévère [64 guns].

Manu thinks it probable that it was the Chevalier de Salvert whom Bernardin Nagéon de l’Estang was alluding to: and opening up H.C.M. Austen’s trusty “Sea Fights and Corsairs of the Indian Ocean” (which specifically covers this series of sea-battles in Chapter V), we find a report (p.188) of de Salvert’s death noted by William Hickey, who had met de Salvert several times on board his ship in January of that year:

“I was greatly concerned to hear that in this action [the fifth and final sea battle] my worthy and respected friend the Chevalier de Salvert lost his life, being cut in two by a cannon-ball on the quarter-deck of the Flamand, while gallantly fighting his ship and encouraging her crew to use their utmost exertions to ensure success. I truly grieved at his death, notwithstanding he died fighting against my country, but that was no fault of his, and I firmly believe a better man never lived, such are the dire and lamentable consequences of war, the best men often being the most unfortunate.”

[Taken from “Memoirs of William Hickey”, published by Messrs. Hurst and Blackett, Ltd, but I’d be more interested in reading this in context in the original Vol III (or possibly Vol II?) than the abridged later version “The Prodigal Rake”.]

All the same, there must surely be many more accounts of this highly-respected Chevalier’s death in the archives yet to be found…

Manu goes on in a second post to recount how he found references to a certain Hélène Nagéon de Lestang, who married the creole poet Antoine Bertin at her stepfather’s property in Saint-Domingue, and links this to the (nearby) 1770 birthplace of (the very real) Jean Marius Justin Nagéon de Lestang.

So that’s as far as Manu got with normal archival research, i.e. not really anywhere substantial. Close, but no cigar.

But then he pulls a gigantic rabbit from his hat, the testimony of Ali Loumi Ben Kace, as given in treasure hunter Patrice Hoffschir’s (2002) Bourbon l’île aux trésors:

“One day, in a sea port in Sicily, I drank too much: and woke up at sea on a pirate ship owned by Bernard Nagéon. I spent more than two years on this ship. […] In the Indian Ocean, we fought with two English corvettes, but we had to flee by night along the coast of Bourbon Island, with a broken main mast and sails, and with four holes torn in the hull. We were then stranded on a reef; and after throwing all the ballast overboard, the boat escaped the reef and we landed on the island. But the hull was holed on a rock and we were all forced to land there. Bernard Nagéon became almost crazy. Despite the waves, he ordered everyone to save what was possible. We managed to get a big chest and a barrel of gold ashore with the captain. […] I saw Bernard himself making marks in the lava rock: a heart and a “B9” shape – everything is hidden there because both holes are now resealed. We left three weeks later on the galley of François Boivin of Saint-Malo, Bernard leaving everything concealed lest Boivin steal it all. […]”

Which, to my ears, sounds utterly peachy and completely made up. But… might it be true? There’s a little more on Hoffschir here, who goes treasure hunting with “une grande dose de spiritualité”. Hmmm…

For fans of the Somerton Man, there would seem to be no obvious end to the list of similar puzzling cold cases to snoop around. One I found recently first properly surfaced in October 2005 in an article by Carol Smith in the Seattle Post-Intelligencer called “The cipher in room 214” (though in the sense of a non-person ‘cipher’, rather than a cryptographic cipher).

This is the case of the woman who put her name down as ‘Mary Anderson’ when she signed in to Seattle’s Hotel Vintage Park on the 9th October 1996. As Smith wrote:-

She made no phone calls. Ordered nothing from room service. Instead, in some unknown sequence, she put out the “Do Not Disturb” sign, applied pink Estée Lauder lipstick and combed her short auburn hair. She wrote a note on hotel stationery, opened her Bible to the 23rd Psalm and mixed some cyanide into a glass of Metamucil.

Then she drank it.

[Image: mary-anderson]

The note said:

To whom it may concern: I have decided to end my life and no one is responsible for my death. Mary Anderson.

P.S. I have no relatives. You can use my body as you choose.

Like our acquaintance from a certain South Australian beach, the woman had no identification – no keys, no credit cards, no tags on her luggage, no fingerprint match. The name, New York address and phone number she had given were all false. And every tiny cluette, as with the Somerton Man, subsequently led the investigation nowhere.

To read more, there is a Doe Network entry, and – as you long-numbed Netizens doubtless already expected – a Mary Anderson cold case Facebook page, where recent postings highlight the suggestion that she may have been Mary Corinne Amos.

[Image: mary-amos]

Though this is a possibility web researchers have long looked at, it all feels quite strange to me. Surely dental records and/or autopsy photographs should be able to rule this out or in very quickly? But this seems never to have happened, and there’s no clear reason why not.

By way of comparison: in 2014, thanks to the Doe Network, a different Mary Anderson (Mary Lynn Anderson) was identified after three decades, closing an equally long-standing cold case. But it doesn’t seem obvious to me why Mary Corinne Amos hasn’t yet been forensically tested against the Room 214 ‘Mary Anderson’: so perhaps I’m missing something.

I don’t know: even though the ‘Mary Anderson’ and Somerton Man cold cases share similar problems of ‘taglessness’ (for want of a better word), I find the latter extremely hard to accept as a suicide. And that’s not because of the lack of a suicide note (which is normally left in only a minority of instances), but rather because of a lack of… a whole load of different things. His death seems neither pre-planned, nor deliberate, nor misadventurous, nor even opportunistic. In that respect, the two cases seem to me to be worlds apart.

PS: when I tried to find ‘Mary Anderson’ on NamUs, I got absolutely nowhere: the cold case seems to have dropped off NamUs’s database. 🙁

A few weeks ago, I stumbled upon a Swiss book publishing website that was planning to re-release “Le Flibustier Mysterieux” in November this year. However, when I tried to find the website again a few days later, it had disappeared off the face of the Internet, which was a bit odd.

Then again, given that Charles de la Roncière wrote “Le Flibustier Mysterieux” in 1934 and died in 1941 (i.e. more than 70 years ago), his book would now seem to be out of copyright according to all the public domain copyright flowcharts I’ve looked at. So it would seem that there’s no obvious reason not to republish it in any format you like, if you want to.

Yet at the same time, there are no obvious digital copies of “Le Flibustier Mysterieux” available: while pirate treasure researchers jealously guard their 150-euro copies of it as if gold doubloons were stuffed inside their covers. (I’d have paid 100 euros myself to get a copy for my own cipher library, but I’ve always arrived too late to pick one up).

Surely someone can photograph or scan this somewhere and we can collectively divide up the pages into blocks and type it in, Project Gutenberg style? Think of it as a dry run for a Cipher Foundation microproject! 🙂

For a long time, I’ve been struggling to make genuine progress with many of the unsolved historical ciphers that I’m so interested in. Many of them suffer from what most would agree is an evidence shortfall, a lack that invariably leads both to a poor level of discourse and to a proliferation of wonky theories (which are arguably two sides of the same badly devalued coin).

For most ciphertexts, there is more and/or better primary evidence yet to be had: though (inevitably) researching, collecting, preparing, and publishing this in a useful way takes organization, time, and money. Of course, even though everyone would benefit from this kind of activity, nobody wants to actually do it themselves: it’s just too big a pain in the neck.

But rather than complain about this, I’ve instead decided to tackle the larger challenge myself: and to do this, have recently started a UK-based charitable foundation called The Cipher Foundation (though it is currently unregistered).

Its (as yet unfinished) website is meant to be a repository for relevant primary or secondary information about individual unsolved historical ciphers: and hence to form, in each case, far more of a practical resource than, say, Wikipedia. At the same time, the Foundation’s website is definitely not meant as a repository for cipher theories, or even people seeking validation for their cipher theories: rather, it is a means for collecting evidence able to raise the level of informed awareness about each of these mystery ciphertexts, and then for giving direct, unfettered access to it.

But how could the Cipher Foundation achieve such a lofty goal? After several months’ thought, I’ve decided that it should mainly function as a platform for discussing, designing, funding, commissioning, supporting, and publishing “microprojects”. These are small, evidence-based research tasks that aim to answer basic questions about unsolved historical ciphers that would probably never happen otherwise.

For example, there are a large number of specific microprojects that could be funded to improve our knowledge about basic aspects of the Voynich Manuscript, such as:-
* DNA analysis of bifolios;
* Microscopic imaging of individual marginalia letters;
* Raman imaging of specific layered features (e.g. f116v and numerous others);
* Making images and transcriptions of many 15th century herbals available to researchers;
…and so forth. And similarly for other unsolved ciphers, too.

Which of these microprojects should the Cipher Foundation be scoping, designing, funding, and commissioning? Right now, I don’t know – but in the long run, I suspect possibly all of them.

All in all, I want to be clear from the start that the intention is not that these microprojects should ‘solve’ historical ciphers, but rather that they should help ‘resolve’ specific uncertainties surrounding them, and thereby (hopefully) give historical codebreakers the best chance of solving them.

Nothing is set in stone as yet, and this is Day One of what will doubtless be many. Note that this site (Cipher Mysteries) will continue very much as it is, though its specific remit will doubtless shift slightly more towards qualified speculation as The Cipher Foundation’s website takes shape.

So… what do you think?

I’ve recently had a number of emails from Don of Tallahassee, describing various ways in which he thinks Voynichese can be decomposed into simpler subunits: broadly speaking, his scheme is similar to Jorge Stolfi’s well-known crust-mantle-core model, but with a very much larger base group.

Numerically, Don’s model works well: but – in my opinion – it doesn’t yet help us move towards what I would consider any of the basic milestones we would need to pass before we can crack the puzzle of Voynichese.

If anyone wants to be the Voynich Champollion, here is my list of the milestones you’ll need to tackle in your research programme, with various sample challenges. I don’t mind admitting that I haven’t yet succeeded at any of these: make no mistake, they are all hugely difficult.

(In the context of Don’s models, my opinion is that he – like many others before him, so it is in no way a criticism – has effectively skipped over the first three milestones, and gone straight for the modelling milestone. But we all need to get vastly more confident about the first three milestones before we can start doing modelling in an effective way.)

Milestone #1: Reading

Personally, I’m not convinced that we’re even reading Voynichese accurately off the page yet.

For example:
* Page-initial letters have quite a different instance frequency distribution from anything else, particularly in the Herbal pages. Why should that be?
* Line-initial letters have, again, a different instance frequency distribution as compared to text within lines. Is it therefore safe to assume that these are the same kind of text as each other?
* In 2006, I proposed that EVA ‘aiin’ characters may well represent Arabic digits, by steganographically enciphering the values using different shapes of the scribal flourish on the tail of the (‘v’-shaped) EVA ‘n’. This basic hypothesis needs to be tested microscopically and with careful imaging techniques, but my proposals to the Beinecke some years ago to do this were turned down.
* Philip Neal has pointed to evidence that certain stylized text sequences may be quite different from the rest of the text. There are both ‘vertical Neal keys’ (down the start column of many pages) and ‘horizontal Neal keys’, which often appear about 2/3rds of the way across the top line of a page or paragraph, and often ‘bracketed’ by a pair of single-leg gallows (‘p’ or ‘f’).
* In 2006, I proposed that Neal keys might form part of a tricky in-page transposition cipher (as described briefly by Alberti in 1467), where the gallows characters might form references to within key-like sequences. But this hypothesis has not been tested any further.

Challenge: when we try to decipher Voynichese, are we even trying to decipher the right thing? When there are so many different things that each suggest that the text as a whole is not a homogeneous entity, why do so many people persist in treating it as if it is a single, simple language?
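As a concrete starting point for the line-initial question, here is a minimal sketch of how you might compare the distribution of line-initial glyphs against everything else in a transcription. It assumes one EVA character per glyph (itself a parsing assumption) and words separated by "." or spaces; the function names and the sample lines are mine, and the sample is made-up EVA-style text rather than a real transcription.

```python
from collections import Counter

def initial_vs_rest(lines):
    """Return (counts of line-initial glyphs, counts of all other glyphs)."""
    first, rest = Counter(), Counter()
    for line in lines:
        glyphs = line.replace(".", "").replace(" ", "")
        if not glyphs:
            continue
        first[glyphs[0]] += 1
        rest.update(glyphs[1:])
    return first, rest

def total_variation(p, q):
    """Half the L1 distance between two normalised count tables (0 = same, 1 = disjoint)."""
    n_p, n_q = sum(p.values()), sum(q.values())
    if n_p == 0 or n_q == 0:
        raise ValueError("both distributions need at least one observation")
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in set(p) | set(q))

# Made-up EVA-style lines, for illustration only.
lines = ["pchedy.qokeedy.daiin", "tshedy.okaiin.chedy", "daiin.shedy.qokaiin"]
first, rest = initial_vs_rest(lines)
print(first, round(total_variation(first, rest), 3))
```

A single distance figure proves nothing on its own, particularly with few lines: you would want to compare it against the same statistic computed for randomly chosen non-initial positions before reading anything into it.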

Milestone #2: Parsing

The second roadblock is that we can’t yet even parse Voynichese. Because of the ambiguities and weird letters, Voynich Manuscript researchers use a stroke-based transcription called ‘EVA’: this lets us transcribe the text and talk about it, even if we disagree (or are uncertain) about how these should be parsed.

For example:
* Is ‘ch’ a unique letter or is it a ‘c’ letter followed by an ‘h’ letter?
* Is ‘ii’ a pair of ‘i’ characters or a separate character?
* Is ‘ee’ a pair of ‘e’ characters or a separate character?
* Are ‘cth’ / ‘ckh’ / ‘cfh’ / ‘cph’ actually a t/k/f/p gallows character followed or preceded by ‘ch’, or four entirely separate composite letters?

Challenge: what kind of statistical tests would help us compare multiple different candidate parsing schemata, to help us decide which ones are more likely?
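
As one hedged illustration of what such a test might look like, the sketch below parses the same EVA words under several candidate schemes and reports the alphabet size and the conditional entropy (how predictable each letter is from the one before it) for each. The scheme names, the `parse` helper, the greedy longest-match rule and the toy word list are all my own assumptions, there purely for illustration.

```python
import math
from collections import Counter

def parse(word, units):
    """Greedy longest-match split of an EVA word into candidate letters."""
    ordered = sorted(units, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        for u in ordered:
            if word.startswith(u, i):
                out.append(u)
                i += len(u)
                break
        else:
            out.append(word[i])  # no unit matched: fall back to a single EVA character
            i += 1
    return out

def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_entropy(symbols):
    """H(next | current), computed as H(adjacent pairs) - H(current)."""
    pairs = list(zip(symbols, symbols[1:]))
    return entropy(pairs) - entropy(symbols[:-1])

# Hypothetical schemes: each set lists the multi-character units treated as one letter.
SCHEMES = {
    "strokes only": set(),
    "ch/sh joined": {"ch", "sh"},
    "ch/sh + ii/ee + cXh": {"ch", "sh", "ii", "ee", "cth", "ckh", "cfh", "cph"},
}

# Toy sample of EVA-style words, for illustration only (not a real transcription).
SAMPLE = "daiin chedy qokeedy shedy okaiin cthy qokaiin chckhy daiin shey".split()

for name, units in SCHEMES.items():
    # "." marks a word boundary so that pairs do not silently span two words.
    letters = [g for w in SAMPLE for g in parse(w, units) + ["."]]
    print(name, len(set(letters)), round(conditional_entropy(letters), 3))
```

A single number like this cannot settle the matter by itself, but computing the same statistics for every candidate scheme (and for known-language control texts) would at least put the schemes on a comparable footing.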

Milestone #3: Tokenization

The third roadblock is that there seems to be strong visual evidence that characters are not the same as tokens: which is to say that some individual letters in the plaintext may map to multiple letters in the Voynichese ciphertext.

For example:
* Is ‘qo’ a token?
* Is ‘dy’ a token?
* Is ‘o’ + gallows a different kind of token to just plain gallows?
* Is ‘y’ + gallows a different kind of token to just plain gallows?

Challenge: what kind of statistical tests would help us compare multiple different candidate tokenization schemata, to help us decide which ones are most likely?
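
For instance, one simple (and certainly not conclusive) check on whether a pair such as ‘qo’ behaves like a single token is to measure how tightly its two halves stick together. The sketch below computes the probability that the second character follows the first, plus the pair's pointwise mutual information. The function name and the toy words are hypothetical, and it works on raw EVA characters, which itself presupposes an answer to the parsing question.

```python
import math
from collections import Counter

def pair_cohesion(words, first, second):
    """Return (P(second | first), PMI in bits) for adjacent characters within words."""
    singles, pairs = Counter(), Counter()
    for w in words:
        singles.update(w)             # counts each character of the word
        pairs.update(zip(w, w[1:]))   # counts each adjacent character pair
    joint = pairs[(first, second)]
    if joint == 0:
        return 0.0, float("-inf")
    n_single = sum(singles.values())
    n_pair = sum(pairs.values())
    # How often `first` is followed by anything at all (i.e. is not word-final).
    follow = sum(c for (a, _), c in pairs.items() if a == first)
    p_joint = joint / n_pair
    p_first = singles[first] / n_single
    p_second = singles[second] / n_single
    return joint / follow, math.log2(p_joint / (p_first * p_second))

# Toy EVA-style words, for illustration only (not a real transcription).
toy_words = ["qokedy", "qokaiin", "okaiin", "daiin", "qoty", "chedy"]
print(pair_cohesion(toy_words, "q", "o"))
print(pair_cohesion(toy_words, "d", "y"))
```

If the first character is (almost) never followed by anything other than the second, the pair is at least a candidate token; running the same measure over every adjacent pair would show which pairs stand out.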

(Note that Milestones #2 and #3 overlap sharply, making the process of getting past them quadratically more difficult, in my opinion.)

Milestone #4: Modelling

Even if we get to the stage that we are able to read, parse and tokenize Voynichese with some degree of certainty, we still face many grave difficulties, not least of which is that we have at least three ‘dialects’ to solve at the same time – Currier A, Currier B, and ‘Labelese’ (for want of a better term). For each of these languages/dialects, we need to model how the language functions and use the results to understand its internal structure.

For example:
* What do the contact tables between adjacent tokens suggest?
* Can we produce Markov models for each of these ‘languages’?
* Is ‘qo’ a free-standing unit (i.e. that is only steganographically prefixed to words), or is it genuinely an integrated part of words?
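
As a rough sketch of the first two questions, a contact table and a first-order Markov model can be built from the same counts. The code below assumes the words have already been tokenized (each word is a list of tokens), uses ‘^’ and ‘$’ as my own markers for word start and word end, and runs on made-up data. None of this is a settled scheme.

```python
from collections import Counter, defaultdict

def contact_table(words):
    """Count how often each token is immediately followed by each other token.

    `words` is a list of words, each a list of tokens (an assumed format).
    "^" and "$" are hypothetical markers for word start and word end.
    """
    table = defaultdict(Counter)
    for tokens in words:
        seq = ["^"] + list(tokens) + ["$"]
        for a, b in zip(seq, seq[1:]):
            table[a][b] += 1
    return table

def transition_probs(table):
    """Turn each row of counts into the probabilities of a first-order Markov model."""
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in table.items()
    }

# Toy tokenized words, made up for illustration (not a real transcription).
toy_a = [["qo", "k", "e", "dy"], ["qo", "t", "e", "dy"], ["d", "a", "iin"]]
probs = transition_probs(contact_table(toy_a))
print(probs["qo"])
print(probs["^"])
```

Building one such table for each of Currier A, Currier B and Labelese, and then comparing them row by row, would be one way to see where the three differ.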

Challenge: what is the mapping between Currier A, Currier B, and Labelese? Can we somehow normalize the three such that they all conform to a single unified scheme? Or are there basic differences between them such that this is impossible?

Back in March 2014 (do you remember 2014? It all seems a bit of a blur), long-time Somerton Man researcher Barry Traish posted the results of his search for word sequences in Project Gutenberg that matched the (very probably) acrostic contents of the Somerton Man’s Rubaiyat note.

He looked for sequences eight to ten words long, and found 41 matches across the corpus’s 45,000 out-of-copyright texts. And here they are:

OABABDWT of a brighter and better day, when the
DWTBIMPA dynasty. When these became inevitable, M. Perier attached
TPMLIABO that point. My life is a bad one
LIABOAIA lad is a brave one, and I am
LIABOAIA literal inflicting a blow on an individual, and
LIABOAIA looked into a book of any importance, as
IABOAIAQ is a beautiful one, and I am quite
IABOAIAQ is as badly off as I am,” quivered
CITTMTSA care I took to make their stay at
CITTMTSA care is taken to make the strokes as
CITTMTSA castes. In the Tanjore Manual, the Shanans are
CITTMTSA Church in this town, Mr. Thomas Smallwood, an
CITTMTSA contemplating in turn the marshes, the sea, and
CITTMTSA conveying it to their master. The Sultan asked
ITTMTSAM I thought to myself that such a man
ITTMTSAM In talking to men–to such a man
ITTMTSAM in the textile, metal, transport, shipping, and machine
ITTMTSAM is that the men that stand around Me
ITTMTSAM it together, that Miss Thorpe should accompany Miss
ITTMTSAM itching to take me to see a man
TTMTSAMS tend to make them soft and mushy. Strawberries
TTMTSAMS than twenty miles…. There soon after midnight…. Steal
TTMTSAMS that transported me: To see a mind so
TTMTSAMS to the metropolis, to seize, at Maunsell’s shop
TTMTSAMS treat the matter too seriously, and merely said
TTMTSAMS Tshaka the Mighty, the swift and merciful stroke
TTMTSAMST* the tetragonal minerals tapiolite (= skogbolite) and mossite, so that
TMTSAMST that makes the sun and moon seem to
TMTSAMST to make their saloon a market, so that
MTSAMSTCA* me to stay; and, merely stopping to cast a
MTSAMSTG motioned the stenographer and Miss Snow to go
TSAMSTCA the sideboard; ask my sister to come and
TSAMSTCA the soldiers any more.” So the child and
TSAMSTCA the stronger, and more slimy) the Cores and
TSAMSTCA their ‘speech,’ and ‘made strange their counsel.’ All
TSAMSTCA to seeke a more safe, then commodious abode
TSAMSTCAB* the scene. After mutual salutations the commissioners asked: “By
TSAMSTGA the same. All men seek to get as
TSAMSTGA the sincere among My servants to gain admittance
TSAMSTGA then summoned all my strength to gaze and
SAMSTGAB Street and Main Street, the grassy area between
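
I don't know exactly how Barry implemented his search, but the basic idea behind a list like the one above can be sketched in a few lines of Python: reduce a text to the initial letters of its words, then look for the target letters as a run within that string. The function name and the word-splitting rule below are my own assumptions.

```python
import re

def acrostic_matches(text, initials):
    """Return every run of consecutive words whose first letters spell `initials`."""
    # Treat any run of letters (plus apostrophes) as a word; punctuation is dropped.
    words = re.findall(r"[A-Za-z][A-Za-z'’]*", text)
    firsts = "".join(w[0].upper() for w in words)
    target = initials.upper()
    hits = []
    start = firsts.find(target)
    while start != -1:
        hits.append(" ".join(words[start:start + len(target)]))
        start = firsts.find(target, start + 1)  # step by one so overlapping runs are found
    return hits

# Two of the phrases from the list above, used here as test input.
print(acrostic_matches("that point. My life is a bad one", "TPMLIABO"))
print(acrostic_matches("I thought to myself that such a man", "ITTMTSAM"))
```

Running something like this over a whole corpus is then just a matter of looping over the texts and sliding eight-to-ten-letter windows along the note's letters.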

Curiously, though, “66% are entirely on the last line”, which in fact highlights the difficulty you get when you try to find words that fit the other lines, particularly the first two lines. Moreover, none of the matches he found were to poems.

Why might this be? Even though Barry tried repeating the process with different letters in those cases where the letter shapes were ambiguous (e.g. M/W), the results were essentially the same. Personally, I wonder whether this indicates something different: that perhaps a number of the guesses the unnamed policeman in SAPOL made for the first line were wrong… and hence that we don’t stand a chance. We really, *really* need a better scan of this page. *sigh* 🙁

But Barry’s pièce de résistance was the backronymic poem that he composed from the Rubaiyat initials. I think this is arguably the best attempt yet (I particularly like “and by and by” for ABAB 🙂 ); see what you think:

“My road goes on, and by and by divides,
Now two branches, into morning, past a new evening that provides,
My love is a barren oblivion, and itself alone quite certain,
It’s time to move the soul among magic stars, then gently asleep besides.”

Splendid, well done Barry! 🙂