robots.txt to prevent crawling of freesites
OK,
I have a guy submitting whose root robots.txt has the following:
User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /gall2/
Disallow: /gall3/
Disallow: /gall4/
Disallow: /gall5/
Disallow: /gall6/
Disallow: /gall7/
Disallow: /gall8/
Disallow: /gall9/
Disallow: /gall10/
Disallow: /gall11/
Disallow: /gall12/
Disallow: /gall13/
Disallow: /cgi-bin/
Disallow: /img/
Domain: soccerwank.com (also sexcarrot.com, where his freesites sit in differently named directories)
On the freesites themselves, in the head, is the meta:
<meta name="robots" content="index, follow">
I was running a link checker and was getting flags with this message:
"The link was not checked due to robots exclusion rules. Check the link manually."
Hence me looking at the root robots file.
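For anyone who wants to reproduce what the link checker was doing, here is a rough sketch using Python's standard urllib.robotparser. The rules are a shortened copy of the file above; the host example.com, the page paths, and the agent name "ExampleBot" are made up for illustration. It shows why the meta tag doesn't help: a crawler that obeys the Disallow lines never fetches the page, so it never gets to read "index, follow".

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt rules quoted above.
RULES = """\
User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /cgi-bin/
Disallow: /img/
"""


def is_crawlable(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a crawler obeying `rules` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs: one inside a disallowed gallery directory, one at the root.
blocked = is_crawlable(RULES, "http://example.com/gall1/index.html")
allowed = is_crawlable(RULES, "http://example.com/index.html")
print("gallery page crawlable:", blocked)
print("root page crawlable:", allowed)
```

A compliant crawler would skip the gallery page outright, whatever meta tags its head contains.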
Seems very fishy to me, and this is titled 'possible cheaters', but I can't fathom whether this is an honest mistake (obviously his freesites aren't going to get picked up by the SEs) or just a way to glean traffic from LLs.