2006-09-16, 02:28 AM   #1
Jel
I'm the only guy in the world who has to wake up to have a nightmare
 
Join Date: Feb 2004
Location: London, United Kingdom
Posts: 1,895
robots.txt to prevent crawling of freesites

OK,

I have a guy submitting whose root robots.txt contains the following:

User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /gall2/
Disallow: /gall3/
Disallow: /gall4/
Disallow: /gall5/
Disallow: /gall6/
Disallow: /gall7/
Disallow: /gall8/
Disallow: /gall9/
Disallow: /gall10/
Disallow: /gall11/
Disallow: /gall12/
Disallow: /gall13/
Disallow: /cgi-bin/
Disallow: /img/

Domain: soccerwank.com (also sexcarrot.com, where his freesites sit in differently named directories)

On the freesites themselves, in the head, is this meta tag:
<meta name="robots" content="index, follow">

I was running a link checker and was getting flags with this message:
"The link was not checked due to robots exclusion rules. Check the link manually."
Hence me looking at the root robots file.
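
For anyone who wants to reproduce the flags, here is a minimal sketch using Python's standard urllib.robotparser. It is not what the link checker itself runs, just an illustration of the same exclusion logic. The rules are rebuilt from the robots.txt quoted above; the function name is_blocked, the user agent "ExampleLinkChecker", and the page paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Rebuild the rules quoted above: /gall/, /gall1/ .. /gall13/, /cgi-bin/, /img/
DIRS = ["gall"] + [f"gall{i}" for i in range(1, 14)] + ["cgi-bin", "img"]
ROBOTS_TXT = "User-agent: *\n" + "".join(f"Disallow: /{d}/\n" for d in DIRS)


def is_blocked(path, user_agent="ExampleLinkChecker"):
    """Return True if a robots.txt-compliant client may NOT fetch `path`.

    `user_agent` is a hypothetical name; the rules apply to every agent
    because the file only has a `User-agent: *` section.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical page paths, only to exercise the rules.
    for path in ["/gall1/index.html", "/img/banner.jpg", "/index.html"]:
        status = "blocked" if is_blocked(path) else "allowed"
        print(f"{path}: {status}")
```

Anything under the listed directories comes back as blocked, while the site root stays crawlable. A crawler that obeys robots.txt never requests the blocked pages at all, so it never gets to read the "index, follow" meta in their head.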

Seems very fishy to me, and this is titled 'possible cheaters', but I can't fathom whether this is an honest mistake, since obviously his freesites aren't going to get picked up by the SEs, or just a way to glean traffic from LLs.