Greenguy's Board


LD 2008-03-12 09:27 AM

About the robots.txt file
 
Assuming I want everything crawled, should I just leave it out completely? Should I use this line in the header:

""

What's the best way to go about this when you want to get crawled?

Maj. Stress 2008-03-12 05:05 PM

One simple text file in the root of your domain is all you need. Use your favorite search engine to find a robots.txt tutorial. :) It will eliminate some 404s from your stats too.
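For illustration, here is a minimal sketch of what an "allow everything" robots.txt could look like, checked with Python's standard `urllib.robotparser`. The file contents, the `example.com` URLs, and the `build_parser` helper are assumptions made up for this example, not something posted in the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents of http://example.com/robots.txt.
# An empty Disallow line means nothing is blocked for any bot.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


allow_all_rules = build_parser(ALLOW_ALL)

# Every path should be crawlable under these rules.
for path in ("/index.php", "/linklist/page.php"):
    url = "http://example.com" + path
    print(path, allow_all_rules.can_fetch("Googlebot", url))
```

Leaving the file out altogether also lets bots crawl everything, but each request for the missing file shows up as a 404 in the logs, which is the point made above.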

LD 2008-03-13 09:37 AM

Thanks...the thing I could not figure out is whether I need the meta tag in my header if I have the file in my domain. I am not sure if the meta tag is a substitute for the file, or if the file eliminates the need for the meta tag...or if having both is an issue.

Maj. Stress 2008-03-13 06:33 PM

According to this site, the meta tag is limiting in comparison to the actual robots.txt file. You can control the spidering of an entire domain from a few lines, depending on what your needs are. You can tell bots not to spider /images folders or cgi-bin, etc.
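As a rough sketch of the "few lines" idea, the rules below block the /images and /cgi-bin folders for all bots and leave everything else open. The exact paths, the `example.com` URLs, and the `is_allowed` helper are assumptions for illustration; Python's standard `urllib.robotparser` is used only to show how a well-behaved bot would read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep bots out of two folders, allow the rest.
BLOCK_FOLDERS = """\
User-agent: *
Disallow: /images/
Disallow: /cgi-bin/
"""

folder_rules = RobotFileParser()
folder_rules.parse(BLOCK_FOLDERS.splitlines())


def is_allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given bot may fetch the path under these rules."""
    return folder_rules.can_fetch(agent, "http://example.com" + path)


for path in ("/index.php", "/images/banner.jpg", "/cgi-bin/counter.cgi"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Rules are matched by path prefix, so one Disallow line covers everything under that folder.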

You said you wanted everything spidered. If you look around at some of the lesser used search engines, you might change your mind. It's amazing some of the things that get indexed.

LD 2008-03-13 08:09 PM

Excellent info at that site, thanks!

Halfdeck 2008-03-15 01:21 PM

""

is useless and unnecessary.

-----------------------------

Here's the difference between META ROBOTS tag and robots.txt.

Robots.txt tells Google not to crawl a URL. "Crawl" means retrieving the content of the page, so the content of a disallowed URL is never fetched.

META ROBOTS noindex tells Google not to show this page in its search results. For that META tag to do its magic, the URL must be crawled (otherwise Google can't see the META tag).
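The interaction between the two can be sketched as a toy model: a bot first checks robots.txt, and only if the URL is allowed does it fetch the page and read the META tag. This is a simplified illustration using Python's standard library, not a description of how Google actually works; the `MetaRobotsParser` class, the `crawl_outcome` function, the sample HTML, and the `example.com` URLs are all assumptions invented for the example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def crawl_outcome(robots_txt: str, url: str, html: str, agent: str = "Googlebot") -> str:
    """Toy model: robots.txt is checked first, the META tag only if the page is fetched."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(agent, url):
        # The page is never retrieved, so any META tag inside it goes unseen.
        return "not crawled"
    meta = MetaRobotsParser()
    meta.feed(html)
    if "noindex" in meta.directives:
        return "crawled, noindex"
    return "crawled, indexable"


NOINDEX_PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'
PLAIN_PAGE = "<html><head><title>Links</title></head></html>"
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"

print(crawl_outcome(BLOCK_PRIVATE, "http://example.com/private/a.html", NOINDEX_PAGE))
print(crawl_outcome(BLOCK_PRIVATE, "http://example.com/public/a.html", NOINDEX_PAGE))
print(crawl_outcome(BLOCK_PRIVATE, "http://example.com/public/b.html", PLAIN_PAGE))
```

In the first case the noindex tag has no effect at all, because the disallow rule stops the page from being read.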

As far as Google is concerned, there's something else to worry about. URLs blocked by a robots.txt disallow still accumulate PageRank. If you're having indexing issues, robots.txt is not a great way to handle them. META ROBOTS=noindex also leads to blocked URLs accumulating PageRank. On the other hand, META ROBOTS=noindex,nofollow may be treated by Google like a dangling page.

LD 2008-03-15 04:13 PM

Thanks Halfdeck, good info if I can digest it..:)

The issue I am having is that my main index page is being indexed and crawled fine ("domain/index.php"), but I'm having problems with my cat pages, "domain/linklist/oral.php" and so forth. I submitted a site map, and G tells me my pages are indexed, yet when I type the URL of most of my category pages in the G search box, I get nothing. They do show up in Yahoo and MSN, and probably others...but no G.

Perhaps I should just wait a while..? I mean after they are indexed, should I allow for a certain amount of time before they show up in the search? They were not even indexed before I submitted the site map, and that was only a week or so ago.

The interesting thing about all this is my shaved pussy cat DOES show up, yet it gets very little traffic and I have no trades there,,,(?)

Simon 2008-03-15 05:28 PM

off-topic
 
1 Attachment(s)
Quote:

Originally Posted by LusciousDelight
The interesting thing about all this is my shaved pussy cat...


stuveltje 2008-03-16 11:30 AM


Stu----->> just spit her diet coke on her screen when she saw that cat picture........hahahahahaha too funny :D

LD 2008-03-16 03:04 PM



Would make a great avatar...:D

Halfdeck 2008-03-17 01:10 AM

Quote:

when I type the url of most of my category pages in the G search box
Usually, it's harder to get deeper pages indexed than the home page, simply due to lack of PageRank. If you have a ton of outbound links on every page (especially the home page), your site gives away too much juice to other sites instead of pumping that juice to internal pages.
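A back-of-the-envelope way to see the "juice" argument, assuming the simplified textbook PageRank model in which a page splits what it passes on equally among all the links on it. That equal split is an assumption for illustration and not a statement about how Google weighs links in practice; the `internal_share` function and the link counts are made up for the example.

```python
def internal_share(internal_links: int, outbound_links: int) -> float:
    """Fraction of a page's passed-on rank that stays on the site,
    assuming every link on the page gets an equal share."""
    total = internal_links + outbound_links
    if total == 0:
        return 0.0  # a page with no links passes nothing on
    return internal_links / total


# A home page with 10 links to its own category pages,
# and a growing number of links to other sites.
for outbound in (0, 10, 40):
    share = internal_share(10, outbound)
    print(f"{outbound} outbound links: {share:.0%} stays internal")
```

Under this model, going from 10 to 40 outbound links drops the share reaching internal pages from half to a fifth.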

Quote:

Perhaps I should just wait a while..? I mean after they are indexed, should I allow for a certain amount of time before they show up in the search?
Yeah, usually Google will get around to indexing pages sooner or later. How quickly your pages get picked up depends mostly on PageRank, but it also depends on other factors, like how often you update your site.

LD 2008-03-17 07:56 AM

Thanks, Halfdeck...that gives me something to go on...


All times are GMT -4.

© Greenguy Marketing Inc