Sunday, May 4, 2008

Google Sets are Actually Pretty Cool

Peter Norvig, Director of Search Quality for Google, stopped by the Emerging Technology Group in November. He gave a talk that rambled a little bit, but could fairly be described as “Google’s point of view on various programming language related things.”
Good talk. He said a lot of interesting things.
One of the things he talked about was Google Sets. From his point of view, the important thing was this:
Every now and then someone will stand up and say “We need to have large scale, established and standardized ontologies in order to make sense of the information on the web.”
But such a categorization is already encoded in the structure of the web, and Google is already using it.
In fact, the added benefit from such resource-intensive ontology development and characterization is not readily apparent.
Which leads us to Google Sets. Google Sets is a way to browse the web’s implicit ontology. What you do is simple: you enter some terms which you already think of as instances of some class. Google then returns what it thinks are the other instances of that class.
Keep in mind that some of the instances it returns are, well, bizarre. Enter the terms “narwhale” and “rhino” and the response list includes, in addition to animals and record companies, the entry “Mailing Lists.” (No doubt there’s a reason for this. I just don’t know what it is.)
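For the curious, here is a toy sketch of the general idea of expanding a set from a few seed terms. To be clear, this is not how Google Sets works (I have no idea what Google does under the hood). It just assumes that items appearing in the same lists as your seed terms probably belong to the same class. The data in TOY_LISTS and the function name expand_set are made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for lists scraped from web pages (made-up data).
TOY_LISTS = [
    ["narwhal", "rhino", "walrus", "elephant"],
    ["rhino", "elephant", "hippo"],
    ["rhino", "narwhal", "mailing lists"],
    ["java", "python", "lisp"],
]


def expand_set(seeds, lists, limit=5):
    """Rank non-seed items by how many seed terms share a list with them."""
    seed_set = {s.lower() for s in seeds}
    scores = Counter()
    for items in lists:
        members = {i.lower() for i in items}
        overlap = len(seed_set & members)
        if overlap == 0:
            continue  # this list says nothing about our seeds
        for item in members - seed_set:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:limit]]


print(expand_set(["narwhal", "rhino"], TOY_LISTS))
```

Even in this tiny example an oddball entry sneaks into the results, simply because it happened to sit in a list next to both seed terms. That may or may not be what happens with the real thing, but it shows how co-occurrence alone can produce strange neighbors.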
Playing around with Google Sets is inherently fun. But it’s also a great way to mine the web. For example, here’s a great use. Enter the name of the company you work for. Enter the name of one of its main competitors. Click “small set.”
Voila! There’s how the web classifies your company.
Tell me another great use for Google Sets.
