robots.txt to prevent crawling of freesites
OK,
I have a guy submitting whose root robots.txt contains the following:

User-agent: *
Disallow: /gall/
Disallow: /gall1/
Disallow: /gall2/
Disallow: /gall3/
Disallow: /gall4/
Disallow: /gall5/
Disallow: /gall6/
Disallow: /gall7/
Disallow: /gall8/
Disallow: /gall9/
Disallow: /gall10/
Disallow: /gall11/
Disallow: /gall12/
Disallow: /gall13/
Disallow: /cgi-bin/
Disallow: /img/

Domain: soccerwank.com (also sexcarrot.com, with different names for the directories his freesites are in).

On the freesites themselves, in the head, is the meta tag: meta name="robots" content="index, follow"

I was running a link checker and kept getting flags with this message: "The link was not checked due to robots exclusion rules. Check the link manually." Hence me looking at the root robots file.

Seems very fishy to me, and this is titled 'possible cheaters', but I can't fathom whether this is an honest mistake - as obviously his freesites aren't going to get picked up by the SEs - or just a way to glean traffic from LLs.
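If anyone wants to double-check what the link checker is complaining about, here's a rough sketch using Python's standard urllib.robotparser module. The domain is the one named above; the two test URLs are made-up paths purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Fetch and parse the root robots.txt (domain taken from the post above).
rp = RobotFileParser("http://soccerwank.com/robots.txt")
rp.read()

# Hypothetical URLs to test -- swap in real gallery/freesite pages.
urls = [
    "http://soccerwank.com/gall1/index.html",
    "http://soccerwank.com/some-freesite/index.html",
]

for url in urls:
    # can_fetch() answers the same question a link checker asks:
    # may a generic crawler ("*") request this URL?
    if rp.can_fetch("*", url):
        print(url, "-> crawlable")
    else:
        print(url, "-> blocked by robots exclusion rules")
```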
Also looks like a way to turn recip links (A->B->A) into one-way links from the LLs to his domains. (One-way links being more valuable.)
He's a member of the board, maybe we'll hear something.
This isn't the only forum member doing this. Like Jel, I'm not sure it's exactly cheating, so I would like to hear more opinions on this.
Malicious intent falls under the unspoken rule of, "I don't like your business practices, therefore, I don't want to do business with you." In this situation, I probably wouldn't send a rejection email, or even ask what's up. I'd just silently make their sites disappear with a quick click of the delete button. Jel, thanks for bringing this issue up. As if I didn't already have enough to check for... |loony|
Methinks folks don't look in this section often enough.
I think I'll send TT a note and tell him to look here; I like him and it surprises me that this kind of thing would be done on purpose. Maybe he's got a good explanation for it. Either way, it certainly isn't in anyone's LL rules that it can't be done, so...? Weird situation.
Hi.
I am the owner of soccerwank.com, sexcarrot.com and pornogata.com, which all have the same robots.txt file. When I first started out building galleries and freesites, I was told to create a robots.txt file like that to prevent Google from crawling thousands of duplicate gallery files and hundreds of duplicated freesite files. I am linking to the freesites on my main site, but of course that would not benefit all the LLs I am submitting to. I will change the robots.txt files on all my domains ASAP, and block crawling of only the gallery folders instead of the freesite directory folders. I am sorry I have brought up the issue; I was really not aware of it, I just followed some friends' good advice.
And please don't see me as a possible cheater... I have never once intentionally tried to cheat fellow webmasters.
I have now removed the disallows on all directories at these 3 domains.
However, I do know most gallery builders that have a hub site are doing the same, but that is not me, so I just wanted to say that my directories are open to spiders now. :)
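For anyone in the same boat, the setup he originally described - disallowing only the duplicate gallery folders while leaving the freesite directories crawlable - would look roughly like this (the directory names are just the ones mentioned above; adjust them to your own layout):

```
User-agent: *
# Block only the duplicate gallery folders...
Disallow: /gall/
Disallow: /gall1/
Disallow: /gall2/
Disallow: /cgi-bin/
Disallow: /img/
# ...and leave the freesite directories out of the Disallow list entirely,
# so spiders can reach them and follow the recip links back to the LLs.
```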
Thanks also to Carrie, who brought this thread to my attention. Virgohippy:
Unfortunately, picxx, you were a victim of bad advice. I'd suggest you no longer listen to the person who gave you this advice in the first place :)
There's a lot of bad information tossed around on boards, etc., and you really have to be careful about who you listen to. This board is a good place to post questions about LLs, as you won't get steered wrong - you'll be getting the info straight from the horse's (owner's) mouth.
With my next reincarnation, I'll make it a point to include an "if you don't see your site listed within a couple weeks, and you don't receive a declined reason, contact me here..." :) But I still refuse to get my hands dirty! |loony|
I know LL owners like to get all the backlinks they can get their hands on (so would I), but you're forgetting one thing:
Duplicate content. Getting linkbacks from supplemental pages is not going to do anyone any good.

EDIT: Not to mention, low-quality backlinks from free sites aren't going to make or break your ranking on Google (though MSN probably eats them up). In a few years, who knows, Google may ignore them altogether.

One way to look at recips is as advertising your LL by increasing brand awareness. Approach them as a means of inflating your SE position, and you're in violation of Google guidelines. Preventing duplicate content is a legitimate reason for disallowing mirrors, however. A large percentage of supps under a domain may negatively impact the entire domain.
BUT make sure your meta tags don't say anything different, that's all. Now, I know this was an honest mistake, but it's a good opportunity for everyone to make sure they check the little details :)
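To make the point concrete, here's the kind of thing to check for - purely an illustrative head snippet, not anyone's actual page:

```html
<!-- A freesite page you DO want crawled and indexed
     (its directory is NOT disallowed in robots.txt): -->
<meta name="robots" content="index, follow">

<!-- A duplicate page you want kept out of the index: either disallow its
     directory in robots.txt, OR leave it crawlable and use noindex here.
     (If the directory is disallowed, spiders never fetch the page at all,
     so any meta tag on it is never even read - that was the mixed signal
     in the original setup.) -->
<meta name="robots" content="noindex, follow">
```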
Therefore, whenever my bot ran across his domains, all of his free sites would get flagged and pulled as unavailable, and I'd have to re-add them manually. This was incredibly annoying. I actually did this for a while but finally got tired of it and just left his sites in an error/delisted status. He no longer submits to me, but I still see his name pop up from time to time.
Linking to a number of freesites on a domain with tons of disallowed pages? Or linking to a number of freesites which may or may not be flagged for spam? Seems to me most submitters aren't able to produce and submit more than a small handful of mirrors anyway. |huh
In that thread, I wrote:
Mirror sites and robots.txt disallows both lead to your LL likely getting no link juice from recips whatsoever. If you want decent backlinks, you might think about accepting only unique free sites. Even then, if you're linking to each other, chances are the link is completely ignored by Google.
But I see your point. In my own experiments I've noticed that backlinks from unique pages, which themselves have backlinks from other unique pages, give a much higher return than backlinks from non-unique pages... well, from Google at least. ;)
Now, I personally always submitted to a small LL grouping to remove the duplicate page penalty, but if there were more LLs that interested me, I would now do as I stated above.

As for whether it's good advice or not to use the robots.txt... if this is setting off people's scripts, if it's too much hassle for them to review your freesites, reviewers will do exactly as preacher and Jel did... they'll most likely not even bother reviewing or listing your sites. So there's a certain futility in following the robots.txt advice if it prevents you from easily getting listed at the LLs you're submitting to. As a submitter you have the option of either submitting to very few LLs (that was my choice when I was regularly submitting), changing the pages enough to avoid the duplicate page penalty (not hard to do when you're working with a template system), or using the robots.txt and not getting listed on a number of sites you're submitting to. Out of those options, the most sane and easy one is to submit to a very small, high-quality group of linklists that you know will list you regularly.

You posted something Linkster said, and maybe he can stop in and help me out on this point, as I will gladly defer to his expertise in this area because I know he knows wayyyy more about this topic than I ever will. In that post you quoted, Linkster says that linksforsex went to a single recip, but a few months ago he went back to the category-specific recips, and that's still the case today. Now, why would a Link List switch back to category recips? As far as I can tell, category recips have mainly been put in place for SEO. I'm not sure what benefit category-specific recips would have other than SEO. If LLs are using category-specific recips, it seems they're doing so for SEO; if that's the case, then it would behoove all involved to make Google as happy as they can and take effort to remove the duplicate page penalty, but at the same time not negate the category recips that so many LLs use by using a robots.txt to block the engine from searching those pages. Essentially you are breaking a Link List's rules, because you're completely negating any benefit the Link List owner was trying to get by having category-specific recips.
BTW Halfdeck, this is a purely academic debate on my part and I respect your opinions... you may be completely right and I may be wrong on it. On the surface, at least to me, the robots.txt thing seems wrong, but I do understand your points on the topic and I'd be curious to see what others think on it as well.
I've generally been of the mind that if it's not in the rules, you can pretty much do what you want, and when LLs encounter these odd situations (obviously only a few are doing this kind of thing), the LL has the responsibility to keep their listings up after confirming by hand or whatever. I admire Preacher for going that extra mile of rekeying his sites in when his script showed these links were disallowed by robots.txt. Also, there is the realpolitik that it's probably not a good idea if LLs begin just ignoring your sites because you use this technique.
I guess what I became worried about was whether picXX was aware of the other topics we were pointing out here and what they mean to him. He obviously is searching out and taking advice on "best practices" by using the robots scheme, but it seems he may have been frightened into taking it down immediately BEFORE he was able to get his pages in order to avoid the duplicate content penalty. Does he clearly understand now that he should go back to his pages and change the text content, alt tags, and meta tags on the mirrors etc., AND maybe even re-submit some of them? I hope so.
My point, though, is if you accept mirror free sites, chances are you're getting linked from a supplemental page, which does you no good anyway. Also, Google seems to be getting pickier about duplicate content, especially from unknown, untrusted, 1-month-old domains, so just tweaking the title/meta tag and on-page text may not always be enough to keep a page in the main index.

Let me post an example. I have a list of free sites here: http://www.nastyxvids.com/sitemap/

Mind you, I built these free sites before I was even aware of search engines, so this isn't exactly scientific (also, site: search is a bit quirky lately, and you may see something different from what I'm seeing depending on which DC you're hitting). The domain is a little short of 2 years old. Pages listed in Google's main index:

http://www.google.com/search?q=site%...en-US:official
http://www.google.com/search?hs=6Db&...2F&btnG=Search
http://www.google.com/search?hs=7tv&...2F&btnG=Search
http://www.google.com/search?hs=duv&...2F&btnG=Search
http://www.google.com/search?hs=SaG&...2F&btnG=Search
http://www.google.com/search?hs=JGb&...2F&btnG=Search
http://www.google.com/search?hs=YbG&...2F&btnG=Search
http://www.google.com/search?hs=ewv&...2F&btnG=Search
http://www.google.com/search?hs=eHb&...2F&btnG=Search
http://www.google.com/search?hs=Hxv&...2F&btnG=Search

Most of the LLs I submitted to are getting no link love from my submissions on that domain.

------------------------------------

The way I'd go about free site mirrors now would be this:

/index.html
/main.html
/gallery1.html
/gallery2.html
/doorway1.html -> links to main.html
/doorway2.html -> links to main.html

Provided doorway1.html is significantly different from /index.html, and assuming the 100s of templates a submitter uses are significantly different from each other (and assuming the 10,000s of submitted free sites are unique enough in terms of on-page text/HTML structure), and assuming further that a submitter builds free sites on a one-year+ old, trusted, TBPR 3+ domain, with plenty of unique text (200-300 words+) on each page... I think all pages will be indexed as unique pages in Google, and no robots.txt disallow is needed.

Still, my main objection would be against tactics aimed at artificially boosting your SE rankings. I wouldn't assume grey-hat methods like recips (they're not citations or "votes", and they have minimal traffic value) will work indefinitely.
http://www.webmasterworld.com/google/3079355.htm

Whether he's bluffing or not, who knows. I do know Google already detects and kills PageRank transfers on *some* bought links, and I assume the same is happening with some traded, "made for SE ranking" links. Another relevant quote (Matt Cutts):
I still do not agree with the mentality of "how can I improve/optimize my ranking without getting penalized?" which seems to be driving this robots.txt discussion. A better question imo would be "how can I make my site more valuable to visitors, and more visible, so more people will find what they're looking for?"

Bottom line: I see nothing wrong with blocking duplicate content pages using robots.txt or a meta noindex tag - that's commonly recommended SEO practice. A free site submitter doesn't gain PageRank by disallowing / noindexing a page. It only prevents duplicate content from being indexed. Tagging a free site page with NOFOLLOW would send me a different signal (a free site submitter trying to hog PageRank), but that's another issue.

P.S. Off topic, but if I ran a LL, I would think about tagging links to free sites with NOFOLLOW, as Technorati does on its tag pages, which are starting to rank very well on Google. You eliminate the reciprocal linking issue (turn all free site links into one-way links), and the possible negative trust brought on by linking to supplemental/duplicate content pages on untrusted domains.
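For anyone unfamiliar with the tag, the difference on a LL category page would be something like this (the URL and anchor text are placeholders, obviously):

```html
<!-- Ordinary listing: counts as a "vote" and passes PageRank to the free site -->
<a href="http://example.com/freesite1/">Free Site Title</a>

<!-- Same listing tagged nofollow: still sends surfers, but tells the engines
     not to treat it as an endorsement -->
<a href="http://example.com/freesite1/" rel="nofollow">Free Site Title</a>
```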
Excellent post, Halfdeck. You've really explained the topic perfectly, and now I do agree with your stance on it. Just a few things.
Keyword
Something like that? Also, since we're really talking about small link lists and their rules (as big linklists probably don't have to worry about the nofollow rule), wouldn't it be wiser for smaller link lists to use recips more like TGPs use recips? Sorry, I'm more of a TGP guy so I'm going to explain this in those terms... but for TGPs we use a single recip with almost no hope of getting SE traffic off that recip. What we use it for is more or less getting a percentage of traffic from other TGPs that list the same gallery (hopefully getting listed with TGPs of equal size or bigger than your own). So the focus of the recip is heavily on branding the domain name / name of the TGP, or, if it's a niche TGP, highlighting the niche quality of your TGP that might draw people off of a general TGP to your site if they have a specific fetish. Shouldn't smaller link lists move to that single recip that heavily brands, instead of a category recip?
If I had a choice, I'd rather link to fewer sites and send more of my traffic to submitters who make quality sites and unique warning pages. IMO, doing so not only benefits both of us as far as SEO is concerned, but it also maintains a quality brand for my own LL. :)
Keyword
A nofollow link is like wearing a condom. |loony|
Well - I guess I better fess up as to why I went to the cat recips - and it actually had nothing whatsoever to do with SEO - as a matter of fact, the submitters will be happy to know that I'm switching back - but it will be a text link, not a gif file.
The reason I did it is that I knew that a certain gif file of mine was in some autosubmitters' programs - I used it as a tool to find the autosubmitters (at least the first part of the detective work anyway) and it worked :) I would say that, based on what I have experienced and some other LLs have seen, category recips don't mean shit unless you also go the route of requesting that the recips be on a page already in Google - similar to what one successful LL owner did to get some really nice spots - but I have to agree that branding in this day and age is ten times as important (building bookmarkers) as trying to run a LL based on SEs only - it may work for a few months, but then when Google tweaks something you try to fight Google instead of expending the energy on making your brand and getting "return surfer trust" - believe me - that group spends way more money :)

/added - I know that putting the submitters through having to do the cat recips was a little extra work on their end - but I think the benefit of having less competition for spots on LLs gives a bigger benefit to free site makers
However, LLs aren't meant to be cheap directories, PageRank boosters, or reciprocal linking networks. A LL is a source of traffic, plain and simple.

Slapping a nofollow on your category page links *may* result in higher SERP positions, because your incoming links from free sites are all one-way. Technorati /tags/ pages are a good example imo of a site that seems to be gaining SERP positions thanks to massive one-way links (it also doesn't hurt that many of these links are from reputable/authority/relevant sites): 52,309 posts are currently tagged "seo" (assuming a high percentage of them point to Technorati with rel="tag" links). Result?

http://www.google.com/search?q=seo&s...en-US:official

9th out of 11 million results. If you look at the top of their page, "http://www.technorati.com/tags/SEO", you'll see: (Not a scientific test.)

John Battelle recently released an interview with Matt Cutts regarding the W3C deciding to use nofollow on all their $1000 links. Question:
http://battellemedia.com/archives/002917.php
I know one big reason I jumped into freesites so strongly was the SE benefits. |loony|
It's gotten to the point where the "SE benefits" (incoming links, quicker spider access to your sites) far outweigh the traffic benefits when submitting to LLs. Most of the submitters I know *only* do it now for SE purposes; any sale the site might possibly make is just a bonus.
Used to be you could submit a freesite and make 3 sales the first week. Now you're lucky if you make 3 sales in the 6 months that it's up (and then some LL owners rip it down). The traffic ain't what it used to be... hence the LL owners trading with TGPs, increasing the required number of pictures, making pictures bigger, etc. If you take away the SE benefits, then there really isn't much reason left to submit anymore, unless you like working your ass off for the possibility of an odd once-in-a-blue-moon signup. The time would be better spent doing other things.
I think another factor in fewer sales might be the sheer number of sites that the larger LLs have in their archives. The ones that have been around a few years most likely have a few hundred thousand sites in there. I was never up for the "remove the listing after 3/6 months" approach (because sometimes the SEs don't even notice you for that long, or they sandbox you for a while to see if you're gonna stick around), but I'm starting to think that dumping sites that are over a year old might help the submitters make more sales. There would still be a ton of sites to keep the bookmarkers around looking at the LLs' ads, so maybe it wouldn't negatively affect the LL owners to do this. Would love your take on it, D.