Monday, September 14, 2009
Most of the recommendations we've made in the past are for individual webmasters running their own websites. We thought we'd offer up some best practices for websites that allow users to create their own websites or host users' data, like Blogger or Google Sites.
-
Make sure your users can verify their website in website management suites such as
Google's Webmaster Tools.
Webmaster Tools provides your users with detailed reports about their website's visibility in Google. Before we can grant your users access, we need to verify that they own their particular websites. Verifying ownership of a site in Webmaster Tools can be done using a custom HTML file, ameta
tag, or seamless integration in your system via Google Services for Websites. Other website management suites such as Yahoo! Site Explorer and Bing Webmaster Tools may use similar verification methods; we recommend making sure your users can access each of these suites. -
Choose a unique directory or hostname for each user.
Webmaster Tools verifies websites based on a single URL, but assumes that users should be able to see data for all URLs 'beneath' this URL in the site URL hierarchy. See our article on verifying subdomains and subdirectories for more information. Beyond Webmaster Tools, many automated systems on the web—such as search engines or aggregators—expect websites to be structured in this way, and by doing so you'll be making it easier for those systems to find and organize your content. -
Set useful and descriptive page titles.
Let users set their own titles, or automatically set the pages on your users' websites to be descriptive of the content on that page. For example, all of the user page titles should not be "Blogger: Create your free blog". Similarly, if a user's website has more than one page with different content, they should not all have the same title: "User XYZ's home page". -
Allow the addition of tags to a page.
Certainmeta
tags are reasonably useful for search engines and users may want to control them. These include tags with thename
attribute ofrobots
,description
,googlebot
,slurp
,msnbot
. Click on the specific name attributes to learn more about what these tags do. -
Allow your users to use third-party analytics packages such as Google Analytics.
Google Analytics is an enterprise-class analytics software that can run on a website by just adding a snippet of JavaScript to the page. If you don't want to allow users to add arbitrary JavaScript for security reasons, the Google Analytics code only changes by one simple ID. If your let your users tell you their Google Analytics ID, you can set up the rest for them. Users get more value out of your service if they can understand their traffic better. For example, see Weebly's support page on adding Google Analytics. We recommend considering similar methods you can use for enabling access to other third-party applications. -
Help your users move around.
Tastes change. Someone on your service might want to change their account name or even move to another site altogether. Help them by allowing them to access their own data and by letting them tell search engines when they move part or all of their site via the use of301
redirect destinations. Similarly, if users want to remove a page/site instead of moving it, please return a404
HTTP response code so that search engines will know that the page/site is no longer around. This allows users to use the urgent URL removal tool (if necessary), and makes sure that these pages drop out of search results as soon as possible. -
Help search engines find the good content from your users.
Search engines continue to crawl more and more of the web. Help our crawlers find the best content across your site. Allow us to crawl users' content, including media like user-uploaded images. Help us find users' content using XML Sitemaps. Help us to steer clear of duplicate versions of the same content so we can find more of the good stuff your users are creating by creating only one URL for each piece of content when possible, and by specifying your canonical URLs when not. If you're hosting blogs, create RSS feeds that we can discover in Google Blog Search. If your site is down or showing errors, please return5xx
response codes. This helps us avoid indexing lots of "We'll be right back" pages by letting crawlers know that the content is temporarily unavailable.
Can you think of any other best practices that you would recommend for sites that host users' data or pages?