Monday, 16 February 2009

Leading search engines combine to clean up results

The New York Times reports on a move by the three main search engines - Google, Yahoo and Microsoft - to reduce the amount of 'clutter' on the web by creating a new web standard that will allow website publishers to flag duplicate pages on their sites. This should allow the search engines to remove many duplicated or 'dead' pages from their indexes, making them more efficient and potentially more comprehensive.

This cooperation between the search engines follows their earlier joint work on the sitemap protocol, and this time targets large dynamic websites (such as e-commerce stores) that generate multiple URLs all pointing to the same page. This can confuse the search engine 'spiders' that trawl the web and lead to the same pages being indexed multiple times. Some estimates claim that as much as 20% of URLs on the web may be duplicates, although this is possibly on the high side.
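To illustrate how one page can end up with several URLs, here is a minimal Python sketch. The domain, the product URLs and the parameter names (sessionid, sort, ref) are all made up for the example; a real site would have its own list of parameters that do not change the page content.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL but not the page content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}

def normalise(url: str) -> str:
    """Drop ignored query parameters and the fragment; sort what remains."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Three different URLs that a dynamic store might generate for one product.
urls = [
    "http://www.example.com/product?id=42",
    "http://www.example.com/product?id=42&sessionid=abc123",
    "http://www.example.com/product?sort=price&id=42&ref=home",
]

# A set keeps one entry per distinct page once the noise is stripped.
unique_pages = {normalise(u) for u in urls}
print(unique_pages)
```

A spider that treats every distinct URL as a distinct page would index this product three times, even though all three addresses reduce to a single page.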

Google has led the way with this move, giving website owners the chance to indicate when a URL is a duplicate and, if so, which is the principal, or “canonical,” URL that search engines should index. Yahoo and Microsoft have agreed to support the same standard. This new Canonical Link Tag, as the standard is known, should make it easier for both publishers and search engines to address the problem, but of course the most important thing is to make web publishers aware of it and to give them an incentive to add the tag to their pages.
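In practice the tag is a link element with rel="canonical" placed in the head section of each duplicate page, pointing at the preferred URL. The Python sketch below shows how a site template might generate it; the function name and the example URL are assumptions made for illustration only.

```python
from html import escape

def canonical_link_tag(canonical_url: str) -> str:
    """Build the <link> element a duplicate page would place in its <head>."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'

# Hypothetical preferred URL for a product page.
tag = canonical_link_tag("http://www.example.com/product?id=42")
print(tag)
```

Every variant of the page would carry the same tag, so whichever URL a spider arrives at, it is told which one to index.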
