According to this recent Wired article, Rajesh Rao, a computer scientist from the University of Washington, has run a Markov chain finder on the 1500-odd fragments of (the as-yet-undeciphered) Indus script – and has ‘discovered’ that it is “moderately ordered, just like spoken languages”.
Well, ain’t that something.
In a depressingly familiar echo of the ‘hoax’ debate over the Voynich Manuscript, the most important result is that it argues against Steve Farmer’s (2004) case that the Indus fragments were merely “political and religious symbols”, i.e. not a language at all, but just odd visual propaganda of some sort.
Language is a tricky, evolving, misunderstood, dynamic artefact that typically only has meaning within a very specific local context. The failure of linguists to “crack” the Indus fragments (all of which are very short) is no failure at all – we are massively disadvantaged by the passing millennia, and cannot easily trace the structure within the flow of ideas (the perennial intellectual historian’s hammer).
Having said that, what I read as Farmer’s basic idea – that researchers have for too long looked for a definitive script grammar as an indicator of advanced literacy – is an excellent point. And so the notion that Indus script analysts should perhaps instead be looking for some kind of arbitrary / non-formalized explanation (a confused model, rather than a complex one) is sensible. My opinion is that Farmer is overplaying his skeptical hand, and that the script is very probably communication (as opposed to mere decoration) – but is it written in something we would recognize as a language? Apparently not, I would say.
Incidentally, Indus script uses roughly 300-400 symbols (depending on how you count them), with the most frequent four symbols making up about 21% of the texts: inscriptions (many on potsherds, also known as ostraca) are all short, with an average length of only 4.6 symbols. All of which gives the script statistics quite unlike those of known written languages – but all the same, what is it?
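Statistics like these are simple to compute once you have a sign-by-sign transcription. Here is a minimal sketch using a made-up toy corpus (each string stands in for one inscription, each character for one sign – not real Indus data):

```python
from collections import Counter

# Hypothetical toy corpus: each string is one inscription, each character one sign.
corpus = ["ABCA", "DAB", "ACBE", "AAD", "BCA"]

signs = [s for text in corpus for s in text]
counts = Counter(signs)
top4 = sum(n for _, n in counts.most_common(4))

print(top4 / len(signs))                    # share of the four commonest signs
print(sum(map(len, corpus)) / len(corpus))  # mean inscription length
```

On the real corpus, the first figure would come out near 21% and the second near 4.6 signs, per the numbers quoted above.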
Perhaps Rajesh Rao’s Markov models will reveal pointers towards its hidden structure, towards the truth – but as to Rao’s suggestion that they may well yield a “grammar”… I suspect not.
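The “moderately ordered” claim boils down to a conditional-entropy measurement: how predictable is the next sign given the current one? A rigidly repeating stream scores near zero, pure noise scores near log2 of the alphabet size, and natural-language scripts sit somewhere in between. A minimal sketch of that measurement, on hypothetical toy streams rather than real Indus data:

```python
import random
from collections import Counter
from math import log2

def conditional_entropy(seq):
    """First-order conditional entropy H(next sign | current sign), in bits.
    Near 0 for a rigidly ordered stream; near log2(alphabet size) for noise."""
    bigrams = Counter(zip(seq, seq[1:]))
    firsts = Counter(seq[:-1])
    total = sum(bigrams.values())
    return -sum((n / total) * log2(n / firsts[a])
                for (a, b), n in bigrams.items())

rigid = list("AB" * 10)                              # fully ordered toy stream
random.seed(0)
loose = [random.choice("ABCD") for _ in range(400)]  # near-random toy stream

print(conditional_entropy(rigid))   # 0.0: next sign is fully determined
print(conditional_entropy(loose))   # just under 2 bits: next sign nearly random
```

A “moderately ordered” script would land between these two extremes – which is suggestive, but (as argued above) a long way from a grammar.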
PS: Farmer cites Gabriel Landini & René Zandbergen’s paper (funny, that), though he points out that Zipf’s Law is an ineffective tool for differentiating language-based texts from non-language-based texts. Just so you know…
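The weakness of Zipf’s Law as a discriminator is easy to demonstrate: as Miller observed back in the 1950s, even “monkey typing” – uniformly random letters plus a space character, chopped into “words” at the spaces – produces a Zipf-like rank-frequency line. A quick sketch (the alphabet and stream length are arbitrary choices):

```python
import math
import random
from collections import Counter

def zipf_slope(tokens):
    """Least-squares slope of log(frequency) vs log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

random.seed(1)
# "Monkey typing": uniform random characters, split into "words" at spaces.
babble = "".join(random.choice("abcde ") for _ in range(20000)).split()
print(zipf_slope(babble))   # a steep Zipf-like negative slope, despite zero meaning
```

A genuine language sample gives much the same shape of curve – which is exactly why a Zipf fit, on its own, cannot settle the language-or-not question.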