I know grandmascrotum is looking for a second opinion, so I should really keep my mouth shut, but here's my take anyway:
http://googlewebmastercentral.blogsp...e-content.html
Written by Adam Lasnik, a Googler - PR guy but IMO reliable. I would not trust what's posted about duplicate content on WMW.
Quote:
During our crawling and when serving search results, we try hard to index and show pages with distinct information. This filtering means, for instance, that if your site has articles in "regular" and "printer" versions and neither set is blocked in robots.txt or via a noindex meta tag, we'll choose one version to list. In the rare cases in which we perceive that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. However, we prefer to focus on filtering rather than ranking adjustments ... so in the vast majority of cases, the worst thing that'll befall webmasters is to see the "less desired" version of a page shown in our index.
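To illustrate the blocking he mentions: supposing the printer-friendly copies live under a /print/ directory (a made-up path, adjust to your own setup), a couple of lines in robots.txt keep crawlers out of them, or you can put a noindex meta tag on each printer version instead. Rough sketch:

  # robots.txt - keep crawlers away from the printer-friendly copies
  User-agent: *
  Disallow: /print/

  <!-- or, in the <head> of each printer-friendly page -->
  <meta name="robots" content="noindex, follow">

The noindex, follow combination still lets the page be crawled and its links followed; it just keeps that version out of the index.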
Summary of his advice:
- Block appropriately
- Use 301s (see the sketch after this list)
- Syndicate carefully
- Minimize boilerplate repetition. "For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details."
- Avoid publishing stubs (what I call "thin pages" - e.g. a blog category page with just one post)
- Understand your CMS. "Make sure you're familiar with how content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats."
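On the 301 point: one common case is www vs. non-www duplicates (just assuming that applies here, and assuming Apache with mod_rewrite enabled - swap in your own domain). Something along these lines in .htaccess consolidates everything onto one hostname:

  RewriteEngine On
  RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

The same idea applies to any retired duplicate URL: 301 it to the version you want indexed so the links pointing at it get consolidated instead of split across copies.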