Greenguy's Board

Blogs and Duplicate Content in Google (http://www.greenguysboard.com/board/showthread.php?t=39559)

Ms Naughty 2007-04-10 08:01 AM

Blogs and Duplicate Content in Google
 
I've been delving into the rather scary, complicated and insane world of SEO the last few days and it's been a bit of a roller coaster ride, to say the least.

One of the big things I've been looking at is duplicate content, because we all know it's something that Google hates.

Today I read that some people are concerned about the amount of duplication that occurs in Wordpress thanks to category, date and tag pages.

http://www.yellowhousehosting.com/re...googles-index/
http://www.webmasterworld.com/google/3097706.htm

Various solutions were offered. One was to make sure the index, category and date pages only showed partial posts (excerpts), so that the full text of each post appeared only on its own page.
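A minimal sketch of that idea, using standard WordPress template tags inside a theme's Loop (this is just an illustration, not code from either of those threads):
Code:

<?php
// Show excerpts on the home page and on category/date/tag archives, so the
// full post text appears on only one URL (the single-post page).
if (have_posts()) {
    while (have_posts()) {
        the_post();
        if (is_single() || is_page()) {
            the_content();   // full text only on the post's own page
        } else {
            the_excerpt();   // summary on index, category, date and tag pages
        }
    }
}
?>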

Someone suggested putting this code in the header:
HTML Code:

<?php if (is_home() || is_single() || is_page()) {
    echo '<meta name="robots" content="index,follow">';
} else {
    echo '<meta name="robots" content="noindex,follow">';
} ?>

Another person talked about 301 redirects for pages.
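For illustration, a 301 done in PHP might look something like this (the hostnames are placeholders, not anything suggested in that thread):
Code:

<?php
// Permanently redirect a duplicate hostname to the canonical one.
if ($_SERVER['HTTP_HOST'] === 'example.com') {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: http://www.example.com' . $_SERVER['REQUEST_URI']);
    exit;
}
?>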

And then someone else said: "I have heard Google handles wordpress out of the box with no problem. I can't speak from experience since all my blogs are small. Matt Cutts uses Wordpress. Search for the character that Matt dressed up as last year. Matt seems to rank ok for that term."

What do the good bloggers here think about this?

Ms Naughty 2007-04-10 08:13 AM

This page sums up the theories on this
http://www.seoresearcher.com/how-to-...ntent-safe.htm

walrus 2007-04-10 01:56 PM

I don't know shit but here goes my take.

First, while I do believe there is a duplicate content penalty, I think most people get too excited when they hear about such things. I pretty much posted my feelings on it here http://www.greenguysboard.com/board/...7&postcount=44

Content is also defined by more than just the text and images. Presentation also fits into content.

With date, category and tag pages you are sorting your posts in a manner that is actually beneficial to the surfer. You're allowing the information to be presented in the manner he is most interested in. In other words, you are enhancing the surfer experience. I find it only logical that Google et al. would find that a positive thing rather than penalizing you for it.

Finally...damn I forgot what finally was. If I remember I guess I'll just post again.

Ms Naughty 2007-04-10 06:21 PM

You're right of course, Walrus. In theory, sites that are good for users should rank well in Google.

There's actually a post in that webmaster world thread that says:
"Sigh. Here we are, building sites for Google rather than surfers."

Halfdeck 2007-04-11 01:27 AM

I know grandmascrotum is looking for a second opinion, so I should really keep my mouth shut, but here's my take anyway:

http://googlewebmastercentral.blogsp...e-content.html

Written by Adam Lasnik, a Googler - PR guy but IMO reliable. I would not trust what's posted about duplicate content on WMW.

Quote:

During our crawling and when serving search results, we try hard to index and show pages with distinct information. This filtering means, for instance, that if your site has articles in "regular" and "printer" versions and neither set is blocked in robots.txt or via a noindex meta tag, we'll choose one version to list. In the rare cases in which we perceive that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. However, we prefer to focus on filtering rather than ranking adjustments ... so in the vast majority of cases, the worst thing that'll befall webmasters is to see the "less desired" version of a page shown in our index.
Summary of his advice:

- Block appropriately
- Use 301s
- Syndicate carefully
- Minimize boilerplate repetition. "For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details."
- Avoid publishing stubs (what I call "thin pages" - e.g. a blog category page with just one post)
- Understand your CMS. "Make sure you're familiar with how content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats."

walrus 2007-04-11 01:53 AM

Quote:

Originally Posted by grandmascrotum (Post 341880)
You're right of course, Walrus. In theory, sites that are good for users should rank well in Google.

There's actually a post in that webmaster world thread that says:
"Sigh. Here we are, building sites for Google rather than surfers."

To be totally honest, I've given up on the thought of optimizing any of my blogs for the SE's and concentrated on trying to build blogs that I like and would visit.

If any of the SE gods like it, fine; if not, fuck 'em. I sleep at night and don't dream of what the next change of the SE gods' wind may bring.

NobleSavage 2007-05-04 06:00 PM

I use a variation of that:

Code:


<?php if (is_home() || is_single() || is_page()) {
    if ($paged < 2) {
        // first page of home, single posts and pages: let it be indexed
        // (the outer condition and meta tag strings were stripped from the
        // original post; they are reconstructed here, presumably mirroring
        // the snippet quoted earlier in the thread)
        echo '<meta name="robots" content="index,follow">';
    } else {
        // paginated views (page 2, 3, ...): keep out of the index
        echo '<meta name="robots" content="noindex,follow">';
    }
} else {
    // everything else (category, date, tag archives): keep out of the index
    echo '<meta name="robots" content="noindex,follow">';
} ?>

If it helps - I don't know.

twn 2007-05-07 08:23 AM

I wouldn't worry too much about duplicate content. I have a vBulletin board just like this board, which has duplicate content virtually everywhere, from the archive to the single forum post view. If Google couldn't tell good duplicate content from bad, it would have to ban all vBulletin boards. But it seems it doesn't, so Google must be quite smart.

walrus 2007-05-07 09:08 AM

Here are a couple of different articles on duplicate-content-proofing your WP blog.

I like this method (no plug-in), but this article links to a WP plug-in which others may like.

Darrell 2007-05-08 07:33 AM

I just use a robots.txt to tell the SE's not to index the category and archive pages so just the post pages get indexed. Seems to work fine for me.
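For illustration, a robots.txt along those lines might look something like this (the paths assume WordPress-style category, tag and date archives; the actual file isn't shown in the thread):
Code:

User-agent: *
Disallow: /category/
Disallow: /tag/
Disallow: /2007/
Disallow: /page/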

bDok 2007-05-08 10:24 AM

Quote:

Originally Posted by Darrell (Post 346490)
I just use a robots.txt to tell the SE's not to index the category and archive pages so just the post pages get indexed. Seems to work fine for me.

On my splog networks this is what I did and it really helped to drive the amount of indexed pages through the roof.

Cheers,
B

Darrell 2007-05-08 11:05 AM

Quote:

Originally Posted by bDok
On my splog networks this is what I did and it really helped to drive the amount of indexed pages through the roof.
Now that's interesting, as I had the same thing happen to me. My sites that have a robots.txt file get more pages indexed, and they get indexed more quickly than sites without one.

hony 2007-05-08 10:30 PM

The way I see it, this is such a common problem that Google has to be able to handle it. Although they would *like* the "one URL = one piece of information" idea to be real, we and they know that in real life one piece of information may end up on many URLs.

So they filter (as stated above). Our only concern as webmasters should be that we are not one of the ones filtered out. Using text that exists on another site is not wise, since we are likely to be filtered out, but on our own sites? Who cares -- one of our pages will show, and that'll do just fine for me.

