Today, Google, Yahoo!, and Microsoft have come together to post details of how each of them supports robots.txt and the robots meta tag. While their posts use terms like “collaboration” and “working together”, they haven’t joined together to implement a new standard (as they did with sitemaps.org). Rather, they are jointly affirming that robots.txt is the standard way of blocking search engine robot access to web sites. They have identified a core set of robots.txt and robots meta tag directives that all three engines support.
Google and Yahoo! already supported and documented each of the core directives, and Microsoft supported most of them before this announcement. In their posts, each engine also lists the directives it supports that may not be supported by the others.
For robots.txt, they all support:
- Disallow
- Allow
- Use of wildcards
- Sitemap location
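As a rough illustration of how a crawler might read these directives, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `ExampleBot` user agent are all made up for the example. Note that this parser applies the first matching rule and does plain prefix matching rather than interpreting wildcards, so it is not a faithful model of how any of the three engines resolve rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block /private/ but allow one page inside it,
# and advertise a Sitemap location. Allow is listed first because this
# parser returns the first rule that matches.
ROBOTS_TXT = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a placeholder crawler name, not a real search engine robot.
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/private/secret.html")
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/private/public-page.html")
sitemaps = parser.site_maps()  # requires Python 3.8+

print(blocked, allowed, sitemaps)
```

Each engine documents its own precedence rules for conflicting Allow and Disallow lines, so a file should be checked against those documents rather than against this sketch.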
For robots meta tags, they all support:
- noindex
- nofollow
- noarchive
- nosnippet
- noodp
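To show where these values live on a page, the following sketch extracts the directives from a robots meta tag using Python's standard-library `html.parser`. The sample page and the `RobotsMetaParser` class name are invented for illustration; real crawlers are likely to be more tolerant of malformed markup than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated; normalize case and whitespace.
        for item in content.split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


# Hypothetical page that asks engines not to index, follow, or cache it.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow, noarchive">
</head><body>Hello</body></html>"""

meta_parser = RobotsMetaParser()
meta_parser.feed(PAGE)
directives = meta_parser.directives

print(sorted(directives))
```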
With this announcement, Microsoft appears to be adding support for * wildcards (which will go live later this month) and for the Allow directive. The biggest discrepancy is with the crawl-delay directive: Yahoo! and Microsoft support it, while Google does not (although Google does support control of crawl speed via Webmaster Tools).
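The two features mentioned here can be sketched in a few lines of Python. The `wildcard_match` helper below is a hypothetical function, not part of any library: it treats `*` as "any run of characters" and, as an assumption based on commonly documented engine behavior, a trailing `$` as an end-of-URL anchor. The crawl-delay part uses the standard library's `RobotFileParser.crawl_delay`; the robots.txt content and the `ExampleBot` name are made up.

```python
import re
from urllib.robotparser import RobotFileParser


def wildcard_match(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    '*' matches any run of characters. A trailing '$' (an assumption,
    see the lead-in) anchors the pattern to the end of the path.
    Without '$' the pattern only has to match a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


pdf_blocked = wildcard_match("/*.pdf$", "/docs/report.pdf")
pdf_query = wildcard_match("/*.pdf$", "/docs/report.pdf?x=1")

# Hypothetical robots.txt that sets a crawl delay only for Yahoo!'s Slurp.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 5

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

slurp_delay = parser.crawl_delay("Slurp")       # delay set for this agent
other_delay = parser.crawl_delay("ExampleBot")  # no delay declared

print(pdf_blocked, pdf_query, slurp_delay, other_delay)
```

A crawler that does not recognize crawl-delay would simply ignore that line, which is why the article points to Webmaster Tools for controlling Google's crawl speed.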
This isn’t the first time the major search engines have come together for an announcement regarding how they support publishers. In late 2006, all three joined together to support XML Sitemaps and launched sitemaps.org, followed in April 2007 by support for Sitemaps autodiscovery in robots.txt and in February 2008 by support for more flexible storage locations for Sitemap files. Earlier still, in early 2005, the engines declared support for the nofollow attribute on links (in an effort to combat comment spam).
Extracted from Search Engine Land, written by Vanessa Fox, 3 June 2008

