When I migrated this blog from Blogger to WordPress, all the accumulated Google PageRank “goodwill” got lost too. So even though Cipher Mysteries has exactly the same text as Voynich News (but better categorized etc), it has a rather meagre PR2 (“PageRank 2”) rather than PR3. As a result, all the Google search-engine traffic to the blog disappeared overnight.

However, things might now be on the mend, and PR3 might just be within reach again. Google Blogsearch now rates Cipher Mysteries as the #1 blog for “Voynich” (which is a pretty good start), and rates it highly enough for a pre-list mention for “cipher”. And if you Google the web for “voynich”, Cipher Mysteries now pops up at #97 – pretty awful, agreed, but not bad considering that it was at #350 not so very long ago. 😮

The point here for all Googlers is that PageRank is a non-semantic algorithm that relies on the wisdom of the crowd of web content writers, and their supposed eagerness to link to authoritative content. Yet there’s a flaw: bloggers are not encyclopedists, but more like jackdaws with ADD, passing on links to that-which-sparkles 20x more than to that-which-is-of-use.
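To make the “non-semantic” point concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is only the simplified published form of the idea, not whatever Google actually runs in production, and the toy web (the page names, the link structure, the 0.85 damping factor) is entirely made up for illustration. Note that the function never looks at a single word of page content: only who links to whom.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a dict of page -> outbound links.

    Only the link graph is consulted; page text never enters the calculation.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three bloggers, one sparkly page, one useful page.
toy_web = {
    "sparkly_video": [],
    "useful_reference": [],
    "blog_a": ["sparkly_video"],
    "blog_b": ["sparkly_video"],
    "blog_c": ["sparkly_video", "useful_reference"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:18s} {score:.3f}")
```

In this toy web the sparkly page comes out on top purely because more bloggers happen to link to it, which is the jackdaw problem in miniature.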

Moreover, sidebar lists of links to external sites (i.e. links repeated across multiple pages with the same text) tend to carry practically zero weight in Google’s scheme.

Put all this together, and it should be brutally apparent that our perception of what is to be found with Google has become largely conditioned not by authority but rather by fashion. Google’s reliance on PageRank has therefore become a curiously double-edged sword, one singularly unable to help us cut through the swathe of semantic dross out there (such as, I don’t know, the 50+ million pages on “Paris Hilton”).

In short, I think the web may now be at the point where PageRank hinders Google’s ability to be useful – “the end of Google 1.0”, you might say. Not that a MegaCorp like that would stay still: perhaps it will split into two to cover the two very different types of searches – say, “Google Surface” (for superficial, faddy, fashionista searches, biased towards YouTube and blogs) and “Google Content” (for semantic searching, valuing original content over copied content).

To a certain degree, you can see this at play already: the more words you include in a search string, the more Google veers away from the PageRank-dominated Google Surface sphere towards the semantic-based Google Content zone. As it stands, Google functions as an uneasy alchemical marriage of these two: but I’m increasingly finding myself dissatisfied by its search results, and I can’t be alone in this.

Really, I’m not anti-Google at all (just imagine a world where we only had a Microsoft-branded search engine): but I do wonder whether we’re now close to (or even past!) the time for Google to reinvent its core search algorithms.

