I’ve just thought of this trick, and I don’t remember anyone suggesting it before. So here goes…
I’ve long suggested that the Voynich Manuscript was written using a combination of the latest (for the 15th century) techniques – abbreviation, verbose cipher, steganography, etc. And (I believe) a good part of the practical problem that this combination of techniques presents to codebreakers is that our analytical tools often assume that these techniques only happen one at a time: and also that they don’t interfere with each other.
Specifically, I strongly believe that Voynichese contains many verbose cipher pairs (qo / ol / al / or / ar / am) and even some verbose cipher blocks (ain, aiin, aiiin, air, aiir), and that the expansion this introduces is largely counterbalanced by abbreviation – truncation (“truncatio”) and contraction (“contractio”). There are a ton of other annoying tricks (e.g. horizontal Neal keys, vertical Neal keys, line-final -m, etc), but verbose cipher and abbreviation are arguably the Big Two.
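To make the verbose cipher idea concrete, here is a minimal Python sketch that splits an EVA-transcribed word into candidate verbose tokens. The token inventory is copied from the pairs and blocks listed above. The greedy longest-match rule and the function name `tokenize` are illustrative assumptions, not an established parsing of Voynichese.

```python
# Hypothetical token inventory, taken from the groups named in the post.
VERBOSE_BLOCKS = ["aiiin", "aiin", "aiir", "ain", "air"]
VERBOSE_PAIRS = ["qo", "ol", "al", "or", "ar", "am"]

# Try longer candidates first so that "aiin" is matched as a single block.
CANDIDATES = sorted(VERBOSE_BLOCKS + VERBOSE_PAIRS, key=len, reverse=True)


def tokenize(word: str) -> list[str]:
    """Greedy left-to-right split into verbose tokens and single glyphs."""
    tokens = []
    i = 0
    while i < len(word):
        for cand in CANDIDATES:
            if word.startswith(cand, i):
                tokens.append(cand)
                i += len(cand)
                break
        else:
            # No verbose group starts here: keep the single glyph.
            tokens.append(word[i])
            i += 1
    return tokens


print(tokenize("qokaiin"))
print(tokenize("saraloaly"))
```

A greedy split is only one possible reading: where groups could overlap (for example "alor" as al + or), a different rule might parse the same word differently, so the token counts it produces should be treated as provisional.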
Now… abbreviation is all very well, but as a technique it relies on the plaintext being very predictable. While this is true for normal text, what I’m pointing out is that there are always going to be a handful of places where an unpredictable name or string pops up in the plaintext, one that the encipherer isn’t convinced that the decipherer will know (even if the encipherer and decipherer happen to be the same person).
In the same way that cartouches highlighted the names of Pharaohs in the Rosetta stone (which led to hieroglyphics being deciphered), perhaps we can use statistics to identify unpredictable-looking blocks of letters. What I’m proposing here is that I suspect such blocks – virtual cartouches, if you like – may well be enciphering unpredictable names or strings in the plaintext. To be fair, I don’t believe that there are more than 10-15 of these scattered through the text: but all the same, this might be a hugely productive place to launch a fresh kind of cryptological attack from.
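One simple way to look for such "virtual cartouches" statistically is to score each word by how surprising its glyph sequence is under a model trained on the rest of the text. The sketch below uses a character bigram model with add-one smoothing and ranks words by average bits per glyph transition. The choice of model, the word-level granularity, and the function names are all my assumptions for illustration. A serious attempt would presumably work from a full transcription and might score verbose tokens rather than raw glyphs.

```python
import math
from collections import Counter


def train_bigram(words):
    """Count glyph bigrams, with ^ and $ marking word start and end."""
    pair_counts = Counter()
    context_counts = Counter()
    alphabet = set("$")  # "$" can follow any glyph, so it counts as a symbol
    for w in words:
        padded = "^" + w + "$"
        alphabet.update(w)
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts, len(alphabet)


def surprisal_per_glyph(word, model):
    """Average bits per transition under the add-one-smoothed bigram model."""
    pair_counts, context_counts, vocab_size = model
    padded = "^" + word + "$"
    bits = 0.0
    for a, b in zip(padded, padded[1:]):
        p = (pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size)
        bits -= math.log2(p)
    return bits / (len(padded) - 1)


def rank_unpredictable(words, top=5):
    """Return the distinct words with the highest average surprisal."""
    model = train_bigram(words)
    return sorted(set(words),
                  key=lambda w: surprisal_per_glyph(w, model),
                  reverse=True)[:top]


# Toy corpus (not real transcription data): two common words plus one oddity.
corpus = ["daiin"] * 50 + ["chedy"] * 50 + ["xqzp"]
print(rank_unpredictable(corpus, top=1))
```

Because the model is trained on the same text it scores, a rare word still contributes to its own counts. Holding each word out before scoring it would be a stricter test, at the cost of more computation.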
Now, I have a vague memory from 20-odd years ago of a heroic ‘Voynich whisperer’ who went out of their way to identify unpredictable-looking blocks of text. I don’t believe it was Glen Claston (Tim Rayhel), but I might be wrong.
Can anyone remember who this was? Or has someone perhaps repeated the same process more recently? I don’t want to launch into this myself if someone has already done this. Thanks!
Nick – Just a suggestion. It seems the sort of data Julian Bunn might have registered when colour-coding the text.
On a separate question – As a non-cryptologist and no specialist in historical linguistics either, I wonder why everyone talks in terms of alphabets not phonemes?
Also if, as it seems, most people think the vegetal section is about medicine and/or pharmacy, wouldn’t it be a good idea to compile a list of attested medico-pharmaceutical abbreviations from documents proper to whatever time and place may be posited as source for the ‘plain text’?
I wonder about such things but have no way to know if such questions are relevant to your work.
Nick,
maybe Mauro, Voynich.Ninja 25th June 2025?
cpholteedycfhoepaiin
saraloaly
dalkalytam
teodyteytar
chkaidararal
shoefcheeykechy
lkaltaraty
toroaldar
olkeealkchedy
psheykedaleey
opalkechckhy
salchtedytar
poldarairol
lolkedykain
dolarshydor
oteoodalsy
The (US) Library of Congress has a nice page on “Deciphering scribal abbreviations” (https://guides.loc.gov/manuscript-facsimiles/deciphering-scribal-abbreviations). One of the things it points out is that some abbreviations were lossy, i.e. there is no unique expansion and the complete word has to be determined from context: “When the letters q, p, b, l, h, and t and a horizontal or diagonal line through them, it meant that some letters were omitted which needed to be supplied by the reader…. Whenever a p had a line through its descender, the possible combinations are per, prae, pre, par, por, pro.” Needless to say, that degree of freedom (a) would make many people (understandably) cringe when it comes to the idea of words being abbreviated in the Voynich text, and (b) if that’s the case, complicates the task of deciphering the text enormously.
Diane, it’s hard to try to answer your question re: alphabets vs. phonemes because it’s not clear what you mean by the question.
Karl,
I’ve noticed recently that when it is said Voynichese records a(?) natural language, there’s a phrase tagged on …”with an alphabet” – as if languages which are recorded in any other form don’t count.
This puzzles me because when, say, Chinese is romanized, we don’t end up with as many symbols as there are written characters in Chinese. The difference is the number of sounds (phonemes) used in speaking a language. That’s what we record (well or badly) when representing it by an alphabet.
If any language can be recorded using an alphabet, it’s nonsense to try limiting the number of languages possibly informing Voynichese by asserting Voynichese can only represent a natural language with its own alphabet.
This led me to ask if the number of symbols used in Voynichese mightn’t be a clue to the number of phonemes in the plaintext language. The relationship between the number of phonemes in a spoken language and the number of alphabetic letters used to represent it is not at all what a novice like me would expect – a table in the wiki article says, with all due reservations and caveats, that Greek and Sino-Tibetan, for example, have the same number of phonemes (42) whereas Ubykh has (had) more than twice as many (86-88).
This made me wonder if possible plaintext languages were better eliminated by their number of phonemes rather than by their own writing system.
After all, there’s no historical or practical objection to Baresch’s understanding that some good soul travelled to eastern parts and brought it back. I don’t suppose he went so far as China but the Voynich glyph-set serving as a kind of universal ‘alphabet’ seems fair enough to me, given the times and the geographic range required to account for the manuscript’s pictorial content.
PS – another of those ‘meme-facts’ being circulated is one suggesting Marcus Marci, rather than Baresch, studied the manuscript and wrote the parallel ‘alphabet’ noticed by Fagin Davis. Where she got the idea I can’t imagine. There’s no sign of Marci’s ever working on it – perhaps Baresch (who certainly did, for about 30 years) hadn’t enough snob value to interest whoever started the meme which seems to have reached Lisa.
Diane: Lisa explained that she came to her conclusion that Marci was a “good match” for the parallel alphabet by comparing it to a range of handwriting samples, which also included Baresch’s. This was all in a post on her blog…and yours is the first comment on it. Even if you disagree with this conclusion, “snob value” and “meme fact” hardly seem fair terms to be using here.
https://manuscriptroadtrip.wordpress.com/2024/09/08/multispectral-imaging-and-the-voynich-manuscript
(Speaking of misrepresenting things, I know I owe you a response to a very strange comment or two of yours when I find them again amidst my tabs, but quite frankly I had no idea what you were talking about or how to respond.)
Going back on topic, I wasn’t around twenty years ago but Mark shared a long list with the Voynich Ninja just two years ago in a thread called “Unusual Words” https://voynich.ninja/thread-4058.html
Tavi – thanks for the reminder.
I’ve re-read that post and see that although Lisa initially discounted Marci on the basis of the ‘letter of gift’, Rene Z. told her that letter had been written by a secretary, so she accepted that assertion and turned to another document to make her comparison.
There is another statement in that post by Lisa that seems to have been taken on trust from Rene, but for which there are no grounds offered by the historical evidence, viz. “The manuscript stayed in Rome until Voynich acquired it in 1912.”
We know it was sent to Kircher in Rome. We know nothing more of its adventures for the following two hundred years and more, when it turns up in Fr. Beckx’ trunk after his return from the years in exile in Florence, where he stayed in a house on property that had belonged to the Medici – in Fiesole above Florence. There is simply no evidence for the interim and a resounding absence of evidence from any catalogue compiled in Rome through that time.
If Marci tried to tackle the ms himself, we might hope to find mention of the fact in one of the letters from Marci to Kircher in the Kircher archives. After Baresch’s death, if Marci’s eyesight was so poor he couldn’t write his own letters, it seems hardly likely he’d be struggling with Voynichese. And whether or not his eyesight was failing, his mind certainly was – he’d ‘forgotten almost everything’ as Kinner wrote. Also, if Marci had earlier (1630s) been assisting Baresch, why should Baresch have had to communicate with Kircher via Kinner as we know he did – and indeed why should Baresch have had to write to Kircher at all – why not Marci all along? Kircher was a notable snob and, as we know, refused to respond civilly to what he had been sent by Baresch.
I’ll happily accept that, from the range of documents Lisa consulted – and after being directed away from the ‘letter of gift’ – Lisa is justified in saying the nearest match from what she saw was something from Marci, but (for instance) what do we know of Jakub H’s handwriting? – and after all, it was his name in it.
I would settle for saying that the hand resembles one taught to children educated by the Jesuits in early seventeenth century Prague. Of course it won’t be like Kircher’s. 🙂
Has anyone produced a paper setting out the evidence and argument for concluding that the ‘letter of gift’ must have been written by a secretary? I’d be interested to read it, and credit its author. Until then, I’m uncomfortable about repeating information whose original source might turn out to be a bit of kite-flying or equivalent to “a bloke told me”.
Nick, answering your question, unfortunately I do not remember this.
Diane, re: your question “… if the number of symbols used in Voynichese mightn’t be a clue to the number of phonemes in the plaintext language”, I suspect the answer is no. To define the relevant terms (https://sounds-write.co.uk/what-are-phonemes-and-graphemes/):
* phonemes are the smallest units of sound that get combined to make (spoken) words;
* graphemes are combinations of one or more letters that are used to represent phonemes: “sh” in “ship” is a two-letter grapheme representing one of the three phonemes making up the word; “eigh” in “weight” is a four-letter grapheme corresponding to a phoneme making up part of a three-phoneme word.
*Even if* you could figure out a way to extract the underlying set of graphemes from a sample of alphabetically written text in an unknown language (and I think that’s probably a big if — I’ve seen a variety of algorithms for trying to find *morphemes*, but that’s a different animal), here’s the problem: graphemes don’t match one-to-one with phonemes. As the page cited above says,
“Some phonemes can be spelled with different graphemes. The sound /k/ can be spelled with the <c>, <k>, or <ck> graphemes. It can even be spelled <ch> in words like chemistry! Many graphemes can represent more than one sound. In words such as odd, no, son and to, the grapheme <o> represents 4 different phonemes: /o/, /oe/, /u/, /oo/.”
So figuring out the number of graphemes the written form of the language uses (which is all you can get from the written language) can’t really tell you how many phonemes those graphemes are used to represent. It can’t even give an upper or lower bound.
None of which is to suggest that there is anything wrong with the idea that the Voynich text could be a phonetic rendering of some non-European language (which may or may not have had a pre-existing writing system of its own) into an otherwise unattested alphabetic script. As always, the issue boils down to converting that into a testable hypothesis and finding a text corpus to use to do the testing…
Karl,
Thanks for the response.
Here’s my wiki source for the number of phonemes per language-group.
https://en.wikipedia.org/wiki/List_of_languages_by_number_of_phonemes