Tuesday, March 06, 2007
Search engine robots, including our very own Googlebot, are incredibly polite. They work hard to respect your every wish regarding which pages they should and should not crawl. How can they tell the difference? You have to tell them, and you have to speak their language: an industry standard called the Robots Exclusion Protocol.
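To make that concrete, here is a minimal sketch of a robots.txt file (the directory names are hypothetical, not from this post). Placed at the root of your site, it asks every robot to stay out of two directories:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/

A well-behaved crawler fetches this file first and skips any URL whose path matches a Disallow rule for its user-agent.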
Dan Crow has written about this on the Google Blog recently. His first two posts in the series give an introduction to setting up your own rules for robots and describe some of the more advanced options.
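On the robot's side of the conversation, a crawler that honors the protocol checks a site's robots.txt before fetching anything. As a rough sketch of that check, not a description of Googlebot's actual implementation, Python's standard urllib.robotparser module shows the shape of it; the bot name and URLs below are hypothetical:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the site's robots.txt (hypothetical site).
    parser = RobotFileParser()
    parser.set_url("https://www.example.com/robots.txt")
    parser.read()

    # A polite robot asks permission before crawling each URL.
    url = "https://www.example.com/private/page.html"
    if parser.can_fetch("ExampleBot", url):
        print("Allowed: crawl", url)
    else:
        print("Disallowed: skip", url)

Given the robots.txt sketch above, can_fetch would return False for this URL, and a polite robot would simply move on.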
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],[],[[["Search engines, like Googlebot, utilize the Robots Exclusion Protocol to understand which parts of your website they should and should not crawl."],["You can control search engine access to your website by creating a robots.txt file that uses this protocol."],["Google provides resources and documentation to help you understand and implement the Robots Exclusion Protocol, including blog posts and help center articles."],["Dan Crow's blog posts offer insights into setting up robots.txt rules and using advanced options for controlling search engine behavior."],["Google has previously published articles on topics like debugging blocked URLs, Googlebot's functionality, and using robots.txt files."]]],[]]