Semantic Data, Schema.org & The Future of Search
by Pete Wailes on June 20, 2011
“My, but we’ve come a long way”, we’ll say on the day when Google’s list of links finally disappears. And that day will come sooner than many think.
Over the past eight or so years that I’ve been working in the search industry, I’ve seen a lot of changes. When I started, Google News & Froogle (what was to become the Shopping search interface) had only recently launched, Google’s entire index was less than 6 billion pages, and there was no Gmail, mobile search, YouTube or Facebook. Bing was MSN Search, powered by Looksmart & Inktomi, and Yahoo! was powered by Google’s technology…
More interesting though has been the lack of innovation in result UI. Oh sure, we’ve got much richer results now than we’ve ever had before, and the underlying technology is far in advance of what it was then, but in terms of how we actually deliver results, I’m not so sure.
A Future Interface
Let me clarify. Based on some recent comments by people at both Google and Microsoft about answering search queries, the interfaces of the future clearly aren’t going to look like they do now. Instead, they’re going to focus far more on actually answering the user’s question. We’ve seen the start of this with Google’s recipe search and Bing’s travel search products.
However, these are just the beginnings of a greater shift in how we interact with the great database that is the Internet. For a more complete understanding, we have to turn, rather strangely, to the world of TV game shows.
Search? It’s Elementary My Dear Watson
Earlier this year, Watson, a supercomputer built by IBM, trounced the two greatest human Jeopardy! players at their own game. Much like a modern web search engine, Watson runs thousands of algorithms simultaneously to actually calculate the correct answer to a question. Now, this is fine where there is an actual answer (questions like ‘what is the’, ‘in what year did’, ‘where can you’ etc), but for questions where a user decision is required, we need to look beyond this.
At this point, we get into the idea of a twin-structured search engine. In the first part, it’d simply attempt to answer a question presented to it. We can already see this done if you ask an engine what the time is in a certain place, what a cinema is showing today, or for the answer to a calculation. It’s simply an extension (albeit a huge one) of technology that’s already in place.
In this particular area, SEO as we know it will die. Google will simply parse the question and deliver the answer. No links involved.
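As a rough illustration of that first half, here is a minimal Python sketch of an engine that answers a calculation outright and only falls back to a results page when it can't. The function names and the ('answer', ...) / ('results', ...) return shape are assumptions made for this example, not a description of how Google or Bing actually work.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed arithmetic expression, rejecting anything else."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("not a simple calculation")


def respond(query):
    """Hypothetical router: return ('answer', value) when the query can be
    answered directly, otherwise ('results', query) so the user can choose."""
    try:
        tree = ast.parse(query, mode="eval")
        return ("answer", _evaluate(tree))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return ("results", query)


if __name__ == "__main__":
    print(respond("12 * (3 + 4)"))
    print(respond("best small family car"))
```

The calculation comes back as a bare answer with no links attached, while the car query is handed on to the second kind of interface.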
The second area though, where the user needs to decide based on information, is quite different. This is where the semantic web truly comes into its own.
The semantic web is a fairly old idea, the crux of which is that one day, all the data on the web will be understandable by machines. To kick-start this, Google, Bing and Yahoo! recently announced the launch of schema.org, a shared markup vocabulary that, like XML sitemaps, is backed jointly by the major engines (but with far broader scope), and which aims to get the entire web marked up in a way that will facilitate this.
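To make "understandable by machines" a little more concrete, here is a minimal Python sketch of how a crawler might read schema.org markup. schema.org is expressed through HTML microdata attributes (itemscope, itemtype and itemprop). The page fragment, the MicrodataReader class and the read_item function are all hypothetical, and the sketch only handles a single, non-nested item.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with schema.org microdata.
PAGE = """
<div itemscope itemtype="http://schema.org/Product">
  <h1 itemprop="name">Example Hatchback</h1>
  <p itemprop="description">A small family car.</p>
  <meta itemprop="brand" content="Example Motors">
</div>
"""


class MicrodataReader(HTMLParser):
    """Collects itemprop values from a single, non-nested item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None  # itemprop whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Value carried in an attribute, as with <meta itemprop content>.
            self.properties[prop] = attrs["content"]
        else:
            # Value is the element's text, gathered in handle_data.
            self._current_prop = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current_prop is not None:
            self.properties[self._current_prop] += data

    def handle_endtag(self, tag):
        self._current_prop = None


def read_item(html):
    """Return (item type, {property: value}) for the marked-up item."""
    reader = MicrodataReader()
    reader.feed(html)
    reader.close()
    properties = {name: value.strip() for name, value in reader.properties.items()}
    return reader.item_type, properties


if __name__ == "__main__":
    print(read_item(PAGE))
```

The point is that the machine never has to guess which piece of text is the product name: the markup says so explicitly.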
In this new web, a search engine would be able to grab any piece of data from any website, understand it, and then use it to produce better answers for the user. So if I were to type in ‘best small family car’, my results page would show me various small family cars, ratings by various associations, new & used prices, ancillary information (videos, image galleries etc), and links to places to go to buy one.
This offers an exciting possibility for consumers – instant, well presented information on any topic, with the option to go out and view the original source information, with greater expansion on the subject if required. Think of it like an uber-Wikipedia. For a live example of something like this working, take a look at this results page for ‘yoga poses’ in Bing.
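The ‘best small family car’ page described above could, in principle, be assembled along these lines. This is only a sketch: the records, the example URLs and the build_results function are invented for illustration, and sorting by average rating is my own assumption rather than anything the engines have said they do.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, as if already extracted from marked-up pages.
RECORDS = [
    {"name": "Car A", "rating": 4.0, "price": 9000,
     "source": "https://reviews.example/car-a"},
    {"name": "Car A", "rating": 5.0, "price": 8500,
     "source": "https://dealer.example/car-a"},
    {"name": "Car B", "rating": 3.0, "price": 7000,
     "source": "https://reviews.example/car-b"},
]


def build_results(records):
    """Merge records from different sites into one entry per car."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["name"]].append(record)

    results = []
    for name, group in grouped.items():
        results.append({
            "name": name,
            "average_rating": mean(r["rating"] for r in group),
            "lowest_price": min(r["price"] for r in group),
            # Links back to the original sources stay attached.
            "sources": [r["source"] for r in group],
        })

    # Assumed ordering: best average rating first.
    results.sort(key=lambda item: item["average_rating"], reverse=True)
    return results


if __name__ == "__main__":
    for entry in build_results(RECORDS):
        print(entry)
```

Each entry combines data from several sites while keeping the source links, which is what lets the user go out to the original pages for more detail.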
Welcome to the Jungle
Now, for the record, I don’t know what Microsoft’s or Google’s intentions are. But it’s increasingly clear that if they wanted, this is a direction that they could move in. With their increasingly titanic data stores, they’re in an amazing position to completely transform how we interact with the world’s information. For now though, webmasters need to consider three things:
- Marking up your data probably won’t help your rankings in any particular area at the moment
- Not marking up your data almost certainly will stop you ranking in different forms of search interface in the future
- The websites that act now will, as always, be better placed when change comes along
So do you need to worry about getting your data marked up today? No, but have it in the back of your mind, and make sure you do it sooner rather than later.