Because Google is like a jetcar with a 20-speed manual gearbox, first gear is plenty for most people. However, if you want the other 19 gears, here are some ideas to get you fired up (just make sure you’re pointing in the right direction first)…
Google’s 2nd gear – Exact-fu
Without much doubt, I think the two basic Google tricks everyone should know are:-
- If you want an exact word match (i.e. not a nearest sound match, or a plural/singular), precede the word with ‘+’. This is most useful when (as is often the case with historical research) you’re looking for a particularly obscure word or name, for which Google will suggest zillions of alternatives. For example, if you want to search for Sirturi (but don’t want the 46,000 hits returned by Sireturi), search instead for +Sirturi and you’ll get the 79 hits you do want.
- If you want an exact phrase, wrap the phrase in double quotes. For example, searching for Nick Pelling gets 144,000 hits (any page containing the two words will do) – but searching for “Nick Pelling” will give you a mere 9,250 hits. (Lazy hack: you can usually omit the final double quotes, as Google is smart enough to fill them in for you.)
Basically, if you know that a given (fairly rare) search term is correct, you’re normally better off preceding it by ‘+’, to ask Google not to get in the way. Of course, leave out the + if you’re not 100% sure!
Note that these two tricks overlap: if you Google for (the doubly-misspelt) cypher mistery, the top result is for cipher mystery (i.e. Google suggests corrections to both words) – but if you search for “cypher mistery” (i.e. the same word pair but in quotes), Google only suggests web-pages with one change to the pair of words.
Google’s 3rd gear – Success-fu
A recurring problem is how to deal with the vast number of pages returned (even with 2nd gear Google-fu): and with just the one lifetime at your disposal, how could you ever sensibly go through a million hits? Of course, you can’t: but here are some neat Google trickettes to help you when your search query has proved, errrm, too successful:-
- If there is some unrelated idea that is diluting your search results, add a word associated with that secondary strand to your search but precede it with ‘-’. For example, if you want to search for Voynich but don’t want any hits related to the Broken Sword computer game (written by Charles Cecil’s company Revolution Software), you could search for Voynich -Revolution. For bonus Google-fu points, try excluding multiple things at the same time, such as Voynich -Revolution -Ethel -ufo
- Use “100 results per page” as your default Google preference. The “Page Down” button (or, more likely these days, “mouse scroll-wheel down”) is a quick way of browsing 10x more results than you would otherwise get. OK, it’s not ideal, but any half-decent researcher should be capable of speed-reading, surely?
In short, being able to use ‘+’, ‘-’ and double-quotes effectively is a good practical starting point for would-be Googlers. Note: while it used to be the case that Google’s engine caused these mechanisms to interfere with each other (specifically, you couldn’t search for quoted strings and exclude search terms at the same time), these days they seem to have sorted all that out. Just in case you run into some outdated information on the web! (As if…)
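If you find yourself typing the same quotes and minus signs over and over, the 2nd and 3rd gear tricks are easy to wrap up in a few lines of Python. This is only a rough sketch: the function name build_query and its parameters are made up for illustration, and it sticks to double quotes and ‘-’ rather than ‘+’.

```python
def build_query(terms=(), phrases=(), exclude=()):
    """Join plain terms, quoted exact phrases and '-' exclusions into one query string.

    All names here are hypothetical helpers, not part of any Google API.
    """
    parts = list(terms)
    # Exact phrases are wrapped in straight double quotes.
    parts += [f'"{phrase}"' for phrase in phrases]
    # Each unwanted word is prefixed with '-'.
    parts += [f"-{word}" for word in exclude]
    return " ".join(parts)


print(build_query(terms=["Voynich"], exclude=["Revolution", "Ethel", "ufo"]))
print(build_query(phrases=["Nick Pelling"], exclude=["Revolution"]))
```

The first call reproduces the Voynich -Revolution -Ethel -ufo example above; the second mixes a quoted phrase with an exclusion, which is the combination that used to misbehave.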
Google’s 4th gear – Refinement-fu
Let’s say you’d like to craft a search query to yield a manageable set of results – say, 50 or 100 hits. But what do you do if your ‘vanilla’ two word search gets a million hits, but an exact phrase search gets only 2 or 3 hits? How can you coax Google into returning a more useful number of hits?
- The OR operator (in caps) lets you merge pairs of search words. Rather than search for Sirtori telescopium and then search for Sirturi telescopium, you can search for Sirtori OR Sirturi telescopium – much more useful. If you’re after bonus Google-fu points here, try using multiple ORs in the same search, such as Sirtori OR Sirturi telescopium OR telescope
- Number ranges have their own merging trick! If you separate two numbers by two dots (i.e. 2006..2008), Google will find you pages containing any number in that range (though note that this doesn’t work with negative numbers, maths fans). A nice example is that searching for Voynich “500..700 ducats” will dig up references both to 600 ducats (Marci) and to 630 ducats (Dee) – pretty neat!
- The ‘*’ operator lets you find documents containing a pair of words separated by one (or two) words. This can be useful when you’re searching for two words that are connected but which don’t usually appear exactly next to each other. For example, if you wanted to find my middle name, Googling for Nick * Pelling returns pages with Nicholas John Pelling – here, note that because I didn’t specify +Nick, Google silently converts it to Nicholas. Also, note that you can progressively weaken the link by adding more stars in a line, but only if you put them inside double quotes – so, “Nick ** Pelling” and “Nick * * * Pelling” will all find pages where the two words appear progressively further apart (however, “Nick * * * * Pelling” won’t work, sorry!)
Basically, once you can consistently use your refinement-fu to control Google, you’re not coping with search results any more… you’re managing them.
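For anyone who prefers to assemble these refinements programmatically, here is a small Python sketch of the three 4th gear tricks (OR, number ranges and stars). The helper names any_of, number_range and wildcard_phrase are invented for this example, and the code only builds query strings: whether Google still honours each operator is something to check for yourself.

```python
def any_of(*words):
    """Merge alternative words with the (upper-case) OR operator."""
    return " OR ".join(words)


def number_range(low, high):
    """Build a low..high number range; negative numbers are rejected,
    since the range trick is described as not working with them."""
    if low < 0 or high < 0:
        raise ValueError("number ranges do not work with negative numbers")
    return f"{low}..{high}"


def wildcard_phrase(first, last, gap=1):
    """Build a quoted phrase with `gap` space-separated stars between two words."""
    stars = " ".join(["*"] * gap)
    return f'"{first} {stars} {last}"'


print(any_of("Sirtori", "Sirturi") + " " + any_of("telescopium", "telescope"))
print(f'Voynich "{number_range(500, 700)} ducats"')
print(wildcard_phrase("Nick", "Pelling", gap=3))
```

Starting with a wide OR group and a generous range, then tightening one piece at a time, is a convenient way to steer a query towards that 50 to 100 hit sweet spot.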
Google’s 5th gear – Zigzag-fu
This is a hard one to describe, but as it defines a gear change all of its own, it needs its own section.
The big takeaway from the preceding gear-fu should be that the point of searching is not to find the perfect page, but rather to find a sensible range of pages clustered around the perfect page – while Google is pretty good at getting you close, you still need to be actively exercising a fair bit of choice if you’re going to find what you want. The skill lies in crafting queries that get you reasonably close (but not too close) to where you want to go.
However… in practice, the whole process doesn’t usually work out quite as well as you would hope – you can’t always “just get closer”, shaving 1,000,000 hits to 100,000 to 10,000 etc. The noble art of “zigzag-fu” involves constructing queries that iteratively zigzag you towards your final query – too many results is bad, too few results is bad, and too spammy / too general a set of results is also bad.
Zigzag-fu is where you build up a feeling for what you’re looking for (even if you haven’t seen it before), and somehow move around it and towards it without really realizing how. People with great zigzag-fu get to where they want to without really thinking – but as this is more of a craft skill, I’m struggling a bit to explain it.
Just practise – I’m sure you’ll get there yourself (if you’re not already there, of course). 😉
Google’s 6th gear – Operator-fu
Google has a sprawling set of obscure “operators” (you can usually recognize them by their trailing colon) for refining searches according to different aspects of the pages found. Having said that, in most cases these are usually only marginally useful – the big trick is realizing when you’re in a big enough hole that only a special-purpose Google crane can hoist you out. “Operator-fu”, therefore, isn’t so much a refined sense of power as a refined sense of danger – i.e. has your search floundered?
- site: – this operator keeps only those pages whose website name (partially) matches the pattern. So, if you only want to find Voynich pages on US university websites, searching for site:.edu Voynich should do the job. The OR operator works on this, so searching for site:.edu OR site:.ac.uk Voynich will find Voynich pages on US and UK university webpages. You can also use this to see how many pages Google has indexed from a given site: for example, searching for site:ciphermysteries.com yields about 613 results (as of today).
- intitle: / inurl: / intext: / inanchor: / allintitle: / allinurl: / allintext: – these tell Google where to look (and, conversely, where not to look) for the keywords you specify. So, searching for allintitle: Voynich Decoded will list all the webpages in Google’s index that contain the words “Voynich” and “Decoded” in the title. Not very useful, but might possibly save the day.
- filetype: – if you are trying to find (say) a pdf containing the phrase “chilled monkey brains”, then Googling for filetype:pdf “chilled monkey brains” should work OK. There are also a load of obscure Google filetypes (such as htpasswd), but that’s a story all to itself. 🙂
- date: – very useful for finding things within the last N months. Not very useful otherwise. 🙂
- daterange: – very useful for finding things within a range of dates. Sometimes a big help!
- The tilde (‘~’) operator forces Google to look for synonyms, even when it doesn’t itself think the word is ambiguous. However, this isn’t really very useful as (by and large) Google guesses right.
For more on these (and other mad Google operators), there’s a nice guide on the Google Guide site.
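Because the colon operators are just text inside the query, they are also easy to generate and turn into a link. The Python sketch below assumes the familiar https://www.google.com/search?q=... URL form; the helper names on_sites, of_filetype and search_url are hypothetical, and the standard library’s urlencode does the escaping.

```python
from urllib.parse import urlencode


def on_sites(*sites):
    """Restrict a search to one or more sites, joined with OR."""
    return " OR ".join(f"site:{site}" for site in sites)


def of_filetype(extension):
    """Restrict a search to a given file type, e.g. pdf."""
    return f"filetype:{extension}"


def search_url(query):
    """Percent-encode the query and attach it as the q parameter (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = on_sites(".edu", ".ac.uk") + " Voynich"
print(query)
print(search_url(query))
print(search_url(of_filetype("pdf") + ' "chilled monkey brains"'))
```

Keeping the query-building and the URL-encoding separate means the same strings can be pasted straight into the search box or saved as bookmarks.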
Google’s higher gears – Ninja-fu
(OK, OK, I know it’s mashing Japanese and Chinese words together, but I wanted to evoke a feeling of mastery over many worlds – just so you know!) Ascended Google Ninja-fu masters come up with a constant stream of tricks that make just as much use of Google’s sprawling array of secondary search apps (half of which the GooglePlex’s Borg mind has probably forgotten about) and its business model. There’s also a 2003 O’Reilly book called Google Hacks, most of which is now out of date, but which should arguably be given to ten-year-olds with their first proper laptop. 🙂
But to such a 33rd Scottish Rite Googler as yourself, it should be clear by now that everything Google does and has is fair game. Here are just a handful of things to consider, from an insanely long list:-
- Google lets you search for ampersand and underscore characters (maybe it’ll help one day).
- Google doesn’t match search phrases over paragraph boundaries (that’s just the way it works).
- Google knows about C++ and C# (helpful for programming searches)
- You can search for stopwords (such as ‘the’, that Google normally discards) by preceding them with a ‘+’. Though some searches (such as for The Who) do automatically include them!
- PageRank dominates short query strings, context dominates long query strings. If you can decide whether PageRank is helpful or unhelpful for your query, you can adjust your query length accordingly.
- Google API-based tricks – too many to list
- Google Trends-based tricks – too trendy to list
- Google Widget-based tricks – too new to list
- Google’s cache, calculator, weather, currency, recipe, flight information… you get the idea!
Of course, if I disclosed these kinds of secrets, I would be hauled in chains before the New World Order’s special blogging oversight committee and thoroughly excoriated (and I like my corium just the way it is, thank you very much). Besides, because Google changes all the time, so does the array of useful higher-gear tricks – and so you’ll be unsurprised to find out that the real art of being an Ascended Master of Google-fu is… making up your own tricks.
Enjoy! 🙂
Very interesting stuff– I think I’ll find the + trick particularly useful. (Also, I wasn’t aware that “excoriated” had that etymology or that “corium” was even a word.)
Excellent Nick!
I have used many of the ‘higher gear’ options,
but the most important one (using the + )
was new to me. Very helpful indeed!!
Hi Rene & Emily,
As an aside, the top three hits when Googling for Rene are to Rene Magritte, Rene Descartes, and Renee Zellweger. That happens even though +Rene yields 36,000,000 hits and +Renee yields 23,000,000 hits!
So, even if you think you are using a common word (and with 36,000,000 pages, Rene is hardly a rarity, it would seem), you can easily be deceived. From a user-interface point of view, what I find most annoying about this is the lack of transparency – that Google omits to tell you that it has treated a search term as ambiguous. I guess this is why so many people don’t know about the ‘+’ trick. 😮
But at least you know now! 🙂
Cheers, ….Nick Pelling….
Thank-fu, Nick 🙂
A splendid article, Nick! Thanks.
Personally I rarely use Google, preferring Ixquick. Ixquick is a metasearch which collates the results of many search engines and doesn’t give you the huge number of hits that Google does. Usually what I want is on the first page. Also, Ixquick is great for privacy, since they don’t keep users’ IP records for over 48 hours!
Of course you want to keep your peel (écorce): what would Pelling be without his peel? 😉
Hi Dennis,
One needs both heart and skin (coeur et peau), right?
Cheers, ….Nick Peau-lling….
I mentioned this page in Japanese: http://kenjioh.com/2009/08/google-fu-search-skill-in-english/
I was searching for information about how to use Google skilfully in English. I was very interested in this “Google-fu” article, so I translated part of this page into Japanese.
However, the gears above are also good for Japanese Google-fu, not only for English. I particularly wanted knowledge of searching in English. I know some search methods specific to Japanese. Do you have any information about searching in English?
BTW, I really enjoyed translating this article into Japanese.
Thank you.
Hi Kenji,
Thanks for the prominent link in your post, glad you enjoyed the article! 🙂
Right now, I don’t know of any English-language-specific Google-fu – but if I find any, I’ll send you a message to let you know.
Cheers, ….Nick Pelling….
Hey Nick!
Thanks for these operators and such. I picked up a few that I didn’t know about.
Just curious, have you seen any changes or additions to “hack” the “new google” ?
Hi, where is the like button? 🙂 Greetings from Berlin, Matthias
Matthias: somewhere along the line, the “Digg This” WordPress plugin I installed last year stopped working *sigh* – but I’ve fixed it now, so feel free to Digg my pages all you like! 😉 Thanks for dropping by!
An excellent summary of the more advanced operators to use with Google, thanks.
My only question is: Is there a simpler way to conduct complex searches online e.g. do any of the other search engines offer easier to use functions with equivalent results?
Nick, ^ that’s spam mate .. Jackie and Graham can f.ck up your computer with the links. I do this service for no charge.
Pete: thanks, it got through because it was probably the most plausible-looking spam comment I’ve had submitted in a while. Certainly more plausible than a lot of the Tamam Troll stuff I get here. 😉 But anyway, removed as per your suggestion, thanks! 🙂
my apologies Graham, are they biting over there?
Looking for a post or comment, in which Nick explained how to search for comments by keyword.
When I find how to do it, I’ll find the comment, and know how to do it.
🙂
Nice collection of information. Apparently, I’ve been operating reasonably happily in Google’s 3rd Gear, with a little bit of 6th Gear thrown in here and there. I’m thankful to learn about the additional ways in which I can refine my searches, although 5th Gear still feels pretty vague to me…
Thanks.
Mike: no problem, though I think this page (from 2009) does need a bit of a reboot. 🙂
PS: good luck with your book next year. 😉
(realising that this is an old one) someone briefly touched on it earlier, and I’m not sure if it’s what they mean, but it’s something that might be value adding/clarifying.
Mixing the 3rd and the 6th rule means that you can filter out sites that you’re not interested in. So if you search for something and most of the first page seems to be links for different pages on one site (we’ll call it fred.com) you can add -site:fred.com
I can’t remember a good example of when this sort of thing happens (maybe filtering out particular forums) but I know I’ve relied on it A LOT.
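As a tiny illustration of that mix of exclusion and site:, here is a Python sketch that appends -site: filters to an existing query. The function name without_sites is invented for the example, and fred.com is just the placeholder site from above.

```python
def without_sites(query, *sites):
    """Append a -site: filter for every site that should be dropped from the results."""
    return " ".join([query] + [f"-site:{site}" for site in sites])


print(without_sites("Voynich", "fred.com"))
```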
The + doesn’t seem to work anymore….
Rene: that’s correct, and there’s a whole load more new Google-fu that has come in as well…
Nick: Yet again another interesting and useful post. I knew some of this, but other things were new to me.
I thought I would list keywords, relevant to my research, I have compiled for anyone interested now or in the future (I am sure I will add more to my list in the future). Search terms for the following languages are included: Italian, Latin, English, French, German, Spanish. There may be useful sources in other languages, but these seem to be the most obvious languages to find references in. If anyone wishes to suggest corrections to the listed keywords, or other keywords relating to my research into the ciphers of Filippo Maria Visconti’s Milan, please do. I also found using the names of known sources as search terms very useful as it tends to find texts which reference the known sources. I am very far from exhausting the searching that can be carried out using these search terms. Here’s the list->
“cryptologia”
“cypher”
“cipher ledger”
“despatches”
“Duke Philip Maria”
“cipher key”
“zifera”
“zifere”
“zyphera”
“duca Filippo”
“duca Philippo”
“duca de/di Milano”
“chiave/i di cifra”
“cifrario”
“scritture segrete/segreta”
“criptografiche”
“scritto crittografico”
“crittografia”
“codificare/codifica/codi/codificato”
“nel codice/in codice”
“scrivere oscuro”
“ciffra”
“cifre”
“cifrante”
“intercetta”
“zifre”
“chiavi di cifra”
“messaggio”
“lettera/messagi/dispacci”
“decifrazione”
“decifrazioni”
“in cifra”
“cifrato”
“registro”
“carteggio”
“codici segreti”
“scritture occulte”
“inventari/cataloghi”
“archivio”
“cifratura”
“cifrista”
“quattrocento”
“primo quattrocento”
“ziffrette”
“zifferatas”
“cipharis”
“cyfris”
“ziferis”
“Philippi Mariae Vicecomitis ducis Mediolani”
“chiffre/chiffrer/dechiffrees”
“crypter”
“duc”
“Philippe Marie”
“verschlüsseler/verschlüsseln”
“entziffern”
“entschlüsseln”
“kryptographie”
“geheimschrift”
“Herzog Philip”
“Mailand”
Español: “Felipe Maria Visconti”
“Meister”
“Die Anfange”
“Cerioni”
“La diplomazia”
“Sacco”
Nick: There are a couple of things I have been thinking about.
1) It would be useful if I could define search strings, rather than having to copy and paste long strings each time.
So for example->
CifraTerms = “decifrazioni” OR “in cifra” OR “cifrato” OR “crittografia”
LetteraTerms = “lettera” OR “messaggi” OR “dispacci”
Then Google: “Visconti” AND LetteraTerms AND CifraTerms
There are many examples where this would be very useful. I find it annoying that words in Latin have many forms and it would be nice to cover all these forms in a search.
2) I wonder if there is a case for coding up a script for searching; though I am not sure if this is over the top.
It would be interesting to know what you think.
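One possible way to get named search strings without any special tooling is a short Python script along these lines. It is only a sketch: TERM_GROUPS, or_group and build_query are names made up here, the two groups are the ones given above, and a plain space stands in for AND on the assumption (borrowed from the article’s multiple-OR example) that each OR only joins the terms on either side of it.

```python
# Named groups of alternative search terms (hypothetical structure).
TERM_GROUPS = {
    "CifraTerms": ["decifrazioni", "in cifra", "cifrato", "crittografia"],
    "LetteraTerms": ["lettera", "messaggi", "dispacci"],
}


def or_group(terms):
    """Quote every term and join the alternatives with OR."""
    return " OR ".join(f'"{term}"' for term in terms)


def build_query(fixed, *group_names):
    """Combine one fixed quoted term with any number of named OR groups."""
    parts = [f'"{fixed}"'] + [or_group(TERM_GROUPS[name]) for name in group_names]
    return " ".join(parts)


print(build_query("Visconti", "LetteraTerms", "CifraTerms"))
```

The printed string can be pasted into the search box as-is, and adding a new language or a new spelling only means editing the dictionary.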
Spellings are infuriating as the number of potential ways of spelling the same name or even word sometimes makes googling a nightmare. Forgetting foreign languages, medieval Latin ways of spelling certain names or words can be so varied. I know spelling wasn’t standardised and Latin words could have many different endings, but even so it sometimes feels ridiculous the number of ways of spelling the same name or word.
As might have been guessed here, I have just discovered another very slightly different spelling of a name that I know very well and which does produce results when searching; how many other very slightly different spellings there are I do not know.
My point above about the value of using search strings stands.
A search string probably wouldn’t be enough, one needs a lookup table in the case of some words.
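A lookup table of that kind could be as simple as a Python dictionary mapping a headword to its known spellings, as in the sketch below. The names VARIANTS, expand and expand_query are invented for illustration, and grouping these particular spellings (taken from the keyword list above) under the headwords “cifra” and “Filippo” is an assumption made for the example, not a claim about which forms really belong together.

```python
# Hypothetical lookup table: headword -> alternative spellings.
VARIANTS = {
    "cifra": ["ciffra", "cifre", "zifera", "zifere", "zifre", "zyphera"],
    "Filippo": ["Philippo", "Philip", "Philippe"],
}


def expand(word):
    """Return the word plus every listed variant as a quoted OR group."""
    forms = [word] + [v for v in VARIANTS.get(word, []) if v != word]
    return " OR ".join(f'"{form}"' for form in forms)


def expand_query(words):
    """Expand each word via the lookup table and join the groups with spaces."""
    return " ".join(expand(word) for word in words)


print(expand_query(["Visconti", "cifra"]))
```

Each newly discovered spelling then only has to be added to the table once, and every later search picks it up automatically.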