Greenguy's Board - View Single Post - robots txt to prevent crawling of freesites

Jel · 2006-09-16, 02:28 AM

OK,

I have a submitter whose root robots.txt contains the following:

User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /gall2/
Disallow: /gall3/
Disallow: /gall4/
Disallow: /gall5/
Disallow: /gall6/
Disallow: /gall7/
Disallow: /gall8/
Disallow: /gall9/
Disallow: /gall10/
Disallow: /gall11/
Disallow: /gall12/
Disallow: /gall13/
Disallow: /cgi-bin/
Disallow: /img/

Domain: soccerwank.com (also sexcarrot.com, where his freesites sit in differently named directories)

On the freesites themselves, in the head, is this meta tag:
<meta name="robots" content="index, follow">

I was running a link checker and was getting flags with this message:
"The link was not checked due to robots exclusion rules. Check the link manually."
Hence me looking at the root robots file.
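
For anyone who wants to reproduce what the link checker is doing, here is a rough sketch using Python's standard urllib.robotparser. It feeds a shortened copy of the rules above to the parser and asks whether a gallery page may be fetched. The example.com URLs and the is_allowed helper are my own placeholders, not anything from the submitter's site, and a real link checker may apply the rules differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules quoted in the post (only a few of the
# gallery directories are listed here to keep the sketch small).
ROBOTS_TXT = """\
User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /cgi-bin/
Disallow: /img/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the rules from a string
    instead of downloading them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder URLs: example.com stands in for the submitter's domain.
gallery_page = "http://example.com/gall1/index.html"
front_page = "http://example.com/index.html"

print(gallery_page, "allowed:", is_allowed(ROBOTS_TXT, gallery_page))
print(front_page, "allowed:", is_allowed(ROBOTS_TXT, front_page))
```

Any robot that honours robots.txt gets the same answer for the gallery directories, which would explain the "not checked due to robots exclusion rules" flags.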

Seems very fishy to me, and this is titled 'possible cheaters', but I can't fathom whether this is an honest mistake or just a way to glean traffic from LLs. Obviously his freesites aren't going to get picked up by the SEs: a spider that obeys robots.txt never fetches those pages, so it never even sees the "index, follow" meta.
