06May 2017

What is the point of EVA? Everyone seems to have forgotten… :-(

In the Voynich research world, several transcriptions of the Voynich Manuscript’s baffling text have been made. Arguably the most influential of these is EVA: this originally stood for “European Voynich Alphabet”, but was later de-Europeanized into “Extensible Voynich Alphabet”.

The Good Things About EVA

EVA has two key aspects that make it particularly well-adapted to Voynich research. Firstly, the vast majority of Voynichese words transcribed into EVA are pronouncable (e.g. daiin, qochedy, chodain, etc): this makes them easy to remember and to work with. Secondly, it is a stroke-based transcription: even though there are countless ways in which the inidvidual strokes could possibly be joined together into glyphs (e.g. ch, ee, ii, iin) or parsed into possible tokens (e.g. qo, ol, dy), EVA does not try to make that distinction – it is “parse-neutral”.

Thanks to these two aspects, EVA has become the central means by which Voynich researchers trying to understand its textual mysteries converse. In those terms, it is a hugely successful design.

The Not-So-Good Things About EVA

In retrospect, some features of EVA’s design are quite clunky:
* Using ‘s’ to code both for the freestanding ‘s’-shaped glyph and for the left-hand half of ‘sh’
* Having two ways of coding ligatures (either with round brackets or with upper-case letters)
* Having so many extended characters, many of which are for shapes that appear exactly once

There are other EVA design limitations that prevent various types of stroke from being captured:
* Having only limited ways of encoding the various ‘sh’ “plumes” (this particularly annoyed Glen Claston)
* Having no way of encoding the various ‘s’ flourishes (this also annoyed Glen)
* Having no way of encoding various different ‘-v’ flourishes (this continues to annoy me)

You also run into various annoying inconsistences when you try to use the interlinear transcription:
* Some transcribers use extended characters for weirdoes, while others use no extended characters at all
* Directional tags such as R (radial) and C (circular) aren’t always used consistently
* Currier language (A / B) isn’t recorded for all pages
* Not all transcribers use the ‘,’ (half-space) character
* What one transcriber considers a space or half-space, another leaves out completely

These issues have led some researchers to either make their own transcriptions (such as Glen Claston’s v101 transcription), or to propose modifications to EVA (such as Philip Neal’s little-known ‘NEVA’, which is a kind of hybrid, diacriticalised EVA, mapped backwards from Glen Claston’s transcription).

However, there are arguably even bigger problems to contend with.

The Problem With EVA

The first big problem with EVA is that in lots of cases, Voynichese just doesn’t want to play ball with EVA’s nice neat transcription model. If we look at the following word (it’s right at the start of the fourth line on f2r), you should immediately see the problem:

The various EVA transcribers tried gamely to encode this (they tried “chaindy”, “*aiidy”, and “shaiidy”), but the only thing you can be certain of is that they’re probably all wrong. Because of the number of difficult cases such as this, EVA should perhaps have included a mechanism to let you flag an entire word as unreliable, so that people trying to draw inferences from EVA could filter it out before it messes up their stats.

(There’s a good chance that this particular word was miscopied or emended: you’d need to do a proper codicological analysis to figure out what was going on here, which is a complex and difficult activity that’s not high up on anyone’s list of things to do.)

The second big problem with EVA is that of low quality. This is (I believe) because almost all of the EVA transcriptions were done from the Beinecke’s ancient (read: horrible, nasty, monochrome) CopyFlo printouts, i.e. long before the Beinecke released even the first digital image scan of the Voynich Manuscript’s pages. Though many CopyFlo pages are nice and clean, there are still plenty of places where you can’t easily tell ‘o’ from ‘a’, ‘o’ from ‘y’, ‘ee’ from ‘ch’, ‘r’ from ‘s’, ‘q’ from ‘l’, or even ‘ch’ from ‘sh’.

And so there are often wide discrepancies between the various transcriptions. For example, looking at the second line of page f24r:

…this was transcribed as:

qotaiin.char.odai!n.okaiikhal.oky-{plant} --[Takahashi] qotaiin.eear.odaiin.okai*!!al.oky-{plant} --[Currier, updated by Voynich mailing list members] qotaiin.char.odai!n.okaickhal.oky-{plant} --[First Study Group]

In this specific instance, the Currier transcription is clearly the least accurate of the three: and even though the First Study Group transcription seems closer than Takeshi Takahashi’s transcription here, the latter is frequently more reliable elsewhere.

The third big problem with EVA is that Voynich researchers (typically newer ones) often treat it as if it is final (it isn’t); or as if it is a perfect representation of Voynichese (it isn’t).

The EVA transcription is often unable to reflect what is on the page, and even though the transcribers have done their best to map between the two as best they can, in many instances there is no answer that is definitively correct.

The fourth big problem with EVA is that it is in need of an overhaul, because there is a huge appetite for running statistical experiments on a transcription, and the way it has ended up is often not a good fit for that.

It might be better now to produce not an interlinear EVA transcription (i.e. with different people’s transcriptions interleaved), but a single collective transcription BUT where words or letters that don’t quite fit the EVA paradigm are also tagged as ambiguous (e.g. places where the glyph has ended up in limbo halfway betwen ‘a’ and ‘o’).

What Is The Point Of EVA?

It seems to me that the biggest problem of all is this: that almost everyone has forgotten that the whole point of EVA wasn’t to close down discussion about transcription, but rather to enable people to work collaboratively even though just about every Voynich researcher has a different idea about how the individual shapes should be grouped and interpreted.

Somewhere along the line, people have stopped caring about the unresolved issue of how to parse Voynichese (e.g. to determine whether ‘ee’ is one letter or two), and just got on with doing experiments using EVA but without understanding its limitations and/or scope.

EVA was socially constructive, in that it allowed people with wildly different opinions about how Voynichese works to discuss things with each other in a shared language. However, it also inadvertantly helped promote an inclusive accommodation whereby people stopped thinking about trying to resolve difficult issues (such as working out the correct way to parse the transcription).

But until we can start find out a way to resolve such utterly foundational issues, experiments on the EVA transcription will continue to give misleading and confounded results. The big paradox is therefore that while the EVA transcription has helped people discuss Voynichese, it hasn’t yet managed to help people advance knowledge about how Voynichese actually works beyond a very superficial level. *sigh*

Posted in: Voynich Manuscript

24 thoughts on “What is the point of EVA? Everyone seems to have forgotten… :-(”

James R. Pannozzi on May 7, 2017 at 11:38 am said:

I avoid it whenever possible, it is nothing but an impediment as your “things wrong” notes imply.

Since I subscribe to the Nahuatl variant theory, I use Dr. Tucker’s transliteration method. (BONUS: Even if wrong, the Nahuatl approach introduces one to all sorts of fascinating MesoAmerican herbs and languages, it’s great fun, informative but, caution….highly addictive),. From EVA, no such benefits accrue but one does get to think like a cryptanalyst of the 1940’s.
Young Kim on May 7, 2017 at 8:25 pm said:

I am not soliciting anything here, but I wonder if you might get interested…
https://journals.equinoxpub.com/index.php/JRDS/article/view/26865
Mark Knowles on May 7, 2017 at 10:13 pm said:

Young Kim: It appears one has to pay to download your paper is that correct?
Young Kim on May 8, 2017 at 9:44 am said:

Mark Knowles: That is correct. If you can locate a hard copy of JRDS journal at a library, I think you could make a copy.
Mark: on May 8, 2017 at 2:52 pm said:

Young Kim: Am I right in thinking you haven’t actually managed yet to translate any of the manuscript yet?
Wladimir D on May 10, 2017 at 3:24 pm said:

I think that we need an adjustment (or the development) of a new alphabet.
First of all, to us need to add “•” as the base symbol.
The first argument is that if the left (or right) foot of the bench turns into a blob (dot), in most cases it decreases in a vertical dimension in comparison with the “e”.
2 / I can not understand how it is possible writen “e” in the one word “chcthso” three times inaccurately (f100r, first line, penultimate word) .
3/ without the symbol “•” it is impossible write explain the difference in the statistics of the transformation of the independent “e” into a point (blot), “e” in the right and left legs of the “bench”.
The transformation of an independent “e” into a blot (or point) is about 10-15 cases.
The transformation the right foot of the bench into a point – less than 10 cases.
The transformation the left foot of the bench into a blot – more than 150 cases. In this case, the ink is added at the second pass to distinguish the point and “e”.
See examples in the topic
PS / Code 188 (V101) occurs f2r, 72r2, 67v2, 114r (5 line), 105r (16), 24r (7,10), 111r (30).
Wladimir D on May 10, 2017 at 4:01 pm said:

topic ARTIFACTS IN THE TEXT https://voynich.ninja/thread-809-page-2.html
Young Kim on May 10, 2017 at 6:56 pm said:

Mark Knowles: That article explains how Voynich text can be translated without identifying its language. It used the whole 1st paragraph on Folio 2r as an example.
Peter on May 11, 2017 at 6:27 am said:

The Problem With EVA:
One builds itself a wall and turns afterwards only in the circle.
You make your mistakes and is also convinced that they are correct. Because you can rely on EVA.

And that is wrong.

The key, the resolution of the characters in the book, does not match the entry in the EVA.
It looks so, but it is not.
Philip Neal on May 11, 2017 at 10:31 pm said:

The idea behind NEVA is to take every distinction between characters in the Voynich script perceptible to Claston and apply an algorithm which translates it into something resembling EVA. Researchers could then specify their analysis of the script in a standardised form, e.g. “All Claston’s variants of EVA >a< are assumed to be allographs”. I have extended this idea to the structure of the page, defining all different types of word divider which might conceivably be distinct. The next step would be to apply the principle to the allographs of each type of stroke which are implicit in the Claston transcription to arrive at a definitive low-level transcription which anyone could parse into a preferred variant of EVA and be able to specify exactly what simplifying assumptions they had made. Would this solve the problem and would anyone be interested in agreeing a standard of this kind?
Rene Zandbergen on May 12, 2017 at 7:32 am said:

Philip,

I fully agree that it is time for a ‘next generation’ transcription system. The character set should be a superset of both (extended) Eva and GC’s v101.

As I wrote some time last year (in the Ninja forum), I could imagine this to be organised as ‘character families’, where variants of different characters are grouped. It would no longer be sufficient to represent a character by a single byte.

The transcription(s) would be stored in a database, and the types of files we are now using should be the result of a query to this database.

All this is only possible through collaboration.
Collaboration also requires the definition of some standards and formats that are used by everyone. This will allow exchange of results and independent verification through repetition.
(This is exactly how people all over the world collaborate in my professional area of work).
Petebowes on May 12, 2017 at 8:50 am said:

Rene, would this collaboration include linguists?
Peter on May 12, 2017 at 9:16 am said:

Example 2, Deception
I am today the opinion that the differences are already in the characters.
So I think that a round 4 does not have the same value as a pointed 4.
I also find these differences in other characters. Random or not? It would explain why all words look similar, but it is not really.
I can not find this difference in normal EVA.
Think however that it is a way for the decoding is.
Simple but effective
nickpelling on May 12, 2017 at 1:29 pm said:

Rene & Philip: I’m very happy to hear that it’s not just me who thinks EVA is in need of something of an overhaul. 🙂

GC’s transcription did improve on EVA in many areas, but he was in thrall to Strong’s attempted decryption, and that didn’t always lead to great decisions (I’m thinking in particular about the ee / eee ligatures, that I think can be extraordinarily hard to tell apart).

One key problem with all Voynich transcriptions made to date is that you can always find a good number of Voynichese words that any given model doesn’t quite encompass: and so the issue of how to transcribe awkward and/or ambiguous stuff is also one that needs considering.

Finally, I’m still (a decade on) extremely suspicious about the scribal flourishes at the end of aiin-family word-endings. I would feel much more comfortable if I could physically test these before trying to help design EVA 2.0. :-/
Rene Zandbergen on May 12, 2017 at 5:16 pm said:

One of the biggest problem is related to the spaces.
In many cases, decisions were made by transcribers based on that they ‘knew’ were valid words.
The only clean way out of that is not to make these decisions as part of the transcription.
In other words: for every character (or stroke) its coordinates need to be recorded separately. This may look like a daunting task, but it is precisely the sort of problem for which expertise is abundantly available in the Voynich community. I am sure that we have more software engineers than medievalists.
In fact, this problem was already partly solved (on a word basis) in the tool at voynichese dot com.

A number of basics should be agreed, including the use of a consistent ‘coordinate system’. Again, there is a solution by Jason Davies, but I think that it should be based on the latest series of scans at the Beinecke (they are much flatter). My proposal would be to base it on the pixel coordinates.

(Another problem for which a clever solution would then be needed would be to transform between Jason Davies coordinates and the newer scans. This is a standard type of problem that has already been solved many times.)

Then we need an inventory of where all the text is. All known transcriptions miss a bit here or there.

How to store and represent meta data: standard solutions used for other texts should be reused wherever possible.

All decisions should be strictly free from any kind of theory.
The only thing that has to be done is to capture shapes.

Finally, all historic transcriptions should be kept as they are, but it should be possible to represent them in whatever new format is decided.
Josef Zlatoděj Prof. on May 12, 2017 at 7:09 pm said:

Friends, ants, Nick and Rene. You work hard. But I’m sorry. It’s still not enough. Ask a question. Where was the manuscript ? ( Bohemia ). Who was Marci ? ( Czech ). What does M. Voynich write to you in a letter to Yale ? He writes the manuscript is a Czech book ! ( Concerninq : substitution = Czech kniha ). And the writes instructions for translation continue. ( Same as on page 116. Key ).

Now I’ll help you again. Page 35 v .
Diane writes that there is ivy. But that’s wrong again.
On page 35v, write the number 8.
Manuscript are numbers. No botany, herbs, recepes, alchemy, alien, astology, etc.
Anyone who looks at the 35 v page should see number 8 ! 🙂
The author writes there what number 8 means ! He writes it in Czech , of course.

Nice day your teacher.
Philip Neal on May 12, 2017 at 8:00 pm said:

Nick, Rene

As I see it, the EVA-EVMT project involves a data layer and a presentation layer. The data layer is a stroke-based analysis of the Voynich text, and the presentation layer is an intuitive mapping of Voynichese characters onto the Latin alphabet retaining backwards compatibility with previous mappings. The two layers together made it possible to display all the existing transcriptions from FSG on as variant parses of the same underlying text.

The various problems which Nick identifies with EVA mainly have to do with the data layer. In particular, EVA does not account for all the differences between the transcriptions at the level of the glyph because the analysis into strokes and ligatures is not sufficiently fine-grained. Claston partly solved this problem by distinguishing all variant forms of the glyphs which it is possible to identify, including many which are clearly insignificant, and to that extent his transcription has all the information content of the others combined. The main point of NEVA was to give v101 the intuitive readability of EVA using Unicode and UTF8, dropping Claston’s unnecessary use of one byte per glyph.

It seems to me that a superset of Claston and EVA would need to take Claston’s hairsplitting approach down to the level of the individual strokes and the ways in which they are combined with, or separated from, their neighbours. There would then arise a need to specify how to parse sequences of strokes into plausible sequences of glyphs and words at the granularity of EVA, in accordance with agreed standards and formats of the kind Rene envisages.

If people are interested in collaborating on a next generation transcription scheme, I think v101/NEVA could fairly easily be transformed into a fully stroke-based transcription which could serve as the starting point. I can also contribute new detail on the word separators, radials etc. and the neglected topic of the space above and below the characters – text above, blank space above etc. I have derived new statistical results from an analysis incorporating this information.
Rene Zandbergen on May 13, 2017 at 6:25 am said:

One thing I would like to point out is that , among all transcriptions in the interlinear file, the only one that was really done in Eva was the one from Takeshi. He did not use the fully extended Eva, which was probably not yet available at that time.
All other transcriptions have been translated from Currier, FSG etc to Eva.
The ‘metadata’ (page variables) were added to the interlinear file, but this is indeed independent from Eva. It is part of the file format, and could equally be used in files using Currier, v101 etc.
(The only problem is that v101 might use some of the brackets that have a special meaning in the file format. This can, however, easily be solved.).

On the point about missing information about spaces between lines: I fully agree. We have a lot of data to do ‘language’ statistics, but no numerical data to do ‘hand’ statistics. This would, however, be solved by what I described: have the locations of all symbols recorded plus, of course their sizes. Where possible also slant angles. This would all come out of some OCR-like process. Again a great opportunity for our software specialists.
I would call it ‘assisted OCR’ since we already have a fairly good first guess of the text.

Again this may seem very difficult, but far more complicated things are solved routinely all over the world.
Peter on May 13, 2017 at 7:03 am said:

@Josef Zlatoděj Prof.
Why did M. Voynich write this? He may have read the name Jacobj ‘a Tepenece on the first page, and knew that Marcus Marci was at the same time at the court of Rudolf II. This has led him to a false assumption. But I can understand well, he also had no C14 analysis.
An empty period of 200 years. It would not surprise me if one of these two people had turned Rudolf’s own book from a family afterlas for 600 guilders.
We also find German text in the VM. The existing clothing goes on the 1400, the battlements only occur south of the Alps. The crowns are typically Hapsburgs 1400. And some more hints.

I’m sorry, but your production location in czech republic is based on a presumption from 1912.
Look, I’ve done my homework.
Josef Zlatoděj Prof. on May 13, 2017 at 10:54 am said:

@Peter. and ants.
I’ve translated the whole manuscript. So I know and know what is written in the manuscript. Everithing is written in Czech. Of course, a copher is also used. There is no evidence that Rudolf owned the book. Everything is just one lady said.
Walls. Tails of swallows. 🙂 Here is important to know and know what is written in the rosette.
Thn it is important to know what the ” Fish “. ( symbol means).
Anyone can see the fish there. That fish waves on you. And that’s the author of the manuscript. ( Eliška ). It’s writte in the rosette. Turn circle. ( Of course it is written in Czech ). Turn around with a circle. So you connect your head to the body. ( The upper falf of the body ). You will see the lower part of the body. Right next to the middle rosete. The person stands at the mine. ( What the ant sees as a rocket ). A leather bag is drawn. From which water flows.
The mine belonget to the Rosenbergs family. The silver was mined in the mine. ( The area is now called Rudolfov ). The mine is drawn in the left rosette. And he was named ” Old Man “. ( Czech language = Starý muž ).

To see it all, turn the whole parchment. Left.

As for the castle . 🙂
So he has not been preserved till nowadays. Only the tower was. Tower is called , Jacobine.
Walls in front of the castle. No tail of swallows. Walls are letters. And they are part of the key. 🙂 Walls – protection. ( symbol ).

Otherwise the family of Rosenberg ( Czech language – Rožmberk ) han many castles and castles. There were more alchemists to him than to Rudolf II. It was Kelly and Dee. And for a long time.

Teacher.
Mark Knowles on May 13, 2017 at 1:08 pm said:

Josef Zlatoděj: I am not surprised by your theory as it is typical that the Turkish guy tells you that the manuscript is Turkish and the Russian guy tells you it is Russian and the Czech guy tells you the manuscript is Czech.
Josef Zlatoděj Prof. on May 13, 2017 at 5:11 pm said:

Mark Knowles : I know what you write. But I’m sorry. All you write is wrong.

You can not find out what the word is saying : aiin.

Of course I know what the word means. 🙂
For the second. I do not write theory. The theory is written by Zandbergen, Nick, you and other ants.
I show you all the way. And what is the ultiamte solution. I’m writing the final solution. So, learn. And the other ants will learn too.

Scientists and academics do not know much, so I’m trying help them.

Teacher.
Rene Zandbergen on May 14, 2017 at 6:07 pm said:

More on the topic of the original post, and the constructive discussion that developped around it:

The reason to go for a next-generation transcription is, to have the best possible data available.

It is not true (as has occasionally been stated) that attempts to understand the Voynich MS failed because the transcriptions were faulty.

Even the earliest transcription systems (FSG, Currier) are sufficient to show the problems and the features of the text.
Where it has always gone wrong is in the interpretation of the statistics.
Basically, the numbers are (usually) correct, but it is not always clear what they mean.
bdid1dr on May 16, 2017 at 6:48 pm said:

Oh well, Nick ! It appears to me that you are attempting to “dump” those persons who don’t believe that the EVA is a form of translation. So be it !

I’ll be devoting my attention to the ‘Somerton Man” mystery — which language is quite intelligible to me. At one time in my lifetime, my neighbors were Judaic. My neighbors on the other side of my house were Muslim. They were all concerned about my being underweight. So—often, there would be a knock on my door. By the time I got to the door, there would be an aluminum pie plate full of delicious ‘home made’ falafel, or noodles in a cream sauce, or roasted vegetables AND fruit.
Both of my neighbors and their spouses and children rarely ever had a conversation with me. They would, however, dance by my house, when they heard the music tapes I was playing. This would happen when I lived in San Franciso — and years later when I lived in San Jose.

I am hoping to go to San Jose’s Greek Food and Dance Festival in June: OPA !
How’s that word translated with EVA ?
Today I’m hoping to do a loop around our mountain community — with my electric bicycle — roughly 15 miles.
bd