Greenguy's Board


Old 2004-10-20, 11:02 AM   #1
aphrophoto
If something goes wrong at the plant, blame the guy who can't speak English
 
Join Date: Aug 2003
Location: Kent, UK
Posts: 33
Link Bots

I'm having trouble submitting to quite a few sites, getting 401 errors (authorisation required). The guy who manages the server where my sites are hosted says there's an empty .htaccess file but nothing else that would generate this error. I've just seen a reference to a 'link bot'. Does this mean a spider, and if so (or not), any ideas what my server guy should be looking for? Thanks
__________________
British Dollars
Old 2004-10-20, 12:15 PM   #2
Greenguy
The Original Greenguy (Est'd 1996) & AVN HOF Member - I Crop Pics For Thumbs In My Sleep
 
Join Date: Feb 2003
Location: Blasdell, NY (shithole suburb south of Buffalo)
Posts: 41,929
Post a URL that's going 401 & we'll have a look at it
__________________

Promote POV Porn Cash By Building & Submitting Galleries to the Porn Luv Network
Old 2004-10-21, 03:46 AM   #3
aphrophoto
This is one of them

http://www.adevil.com/tgpweb/F14/b/index.htm

The Outlawsporn script said it went 401, and so far, out of 120 submissions, 9 reported it right away; others will probably report it when the owner runs their bot/script.

I do have a paysite on the same web space (www.adevilsamateurs.com) which uses .htaccess for the members area, but all my free sites and their associated pics and banners are in the public part of my web space.
Old 2004-10-21, 07:37 AM   #4
Cleo
Subversive filth of the hedonistic decadent West
 
Join Date: Mar 2003
Location: Southeast Florida
Posts: 27,936
Having spaces in your file names isn't helping you.

The HPA on your main page has what looks like clickable thumbs, so it got rejected at my LL.

Also it loaded really slowly when I was doing reviews yesterday.
__________________
Free Rides on Uber and Lyft
Uber Car: uberTzTerri
Lyft Car: TZ896289
Old 2004-10-21, 08:37 AM   #5
cd34
a.k.a. Sparky
 
Join Date: Sep 2004
Location: West Palm Beach, FL, USA
Posts: 2,396
Here's what some spiders will do:

telnet adevil.com 80
Trying 67.19.97.178...
Connected to 178.67-19-97.reverse.theplanet.com.
Escape character is '^]'.
GET /tgpweb/F14/b/index.htm HTTP/1.0
Host: adevil.com

HTTP/1.1 401 Authorization Required
Date: Thu, 21 Oct 2004 12:31:15 GMT
Server: Apache/1.3.31 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2634a mod_ssl/2.8.20 OpenSSL/0.9.7a PHP-CGI/0.1b
Connection: close
Content-Type: text/html; charset=iso-8859-1

Here's what a good spider should do:

telnet adevil.com 80
Trying 67.19.97.178...
Connected to 178.67-19-97.reverse.theplanet.com.
Escape character is '^]'.
GET /tgpweb/F14/b/index.htm HTTP/1.0
Host: adevil.com
User-Agent: wGet

HTTP/1.1 200 OK
Date: Thu, 21 Oct 2004 12:21:05 GMT
Server: Apache/1.3.31 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2634a mod_ssl/2.8.20 OpenSSL/0.9.7a PHP-CGI/0.1b
Last-Modified: Wed, 20 Oct 2004 12:48:14 GMT
ETag: "28042d-31bb-41765e8e"
Accept-Ranges: bytes
Content-Length: 12731
Connection: close
Content-Type: text/html

Basically, the first request is missing the User-Agent: header and receives a 401. The second request, which includes a User-Agent: header, returns the content in question.

I tried a few different requests and I am a little unsure exactly what he is blocking, but most User-Agents go through, so it appears he is only blocking blank User-Agents (the browser ID string).
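
If you'd rather not type the telnet sessions by hand, something like the following Python sketch should send the same two requests. It only assumes the behaviour observed above (a blank User-Agent gets a 401); the function names build_request and fetch_status_line are made up for illustration, and "wGet" is just the string used in the second session.

```python
import socket


def build_request(host, path, user_agent=None):
    """Build a raw HTTP/1.0 GET request; leave out User-Agent when it is None."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}"]
    if user_agent is not None:
        lines.append(f"User-Agent: {user_agent}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_status_line(host, path, user_agent=None, port=80, timeout=10):
    """Send the request and return only the first line of the response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path, user_agent))
        data = b""
        # Read until the status line is complete or the server closes.
        while b"\r\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")


# Example calls (these need network access, so they are left commented out):
# fetch_status_line("adevil.com", "/tgpweb/F14/b/index.htm")          # no User-Agent
# fetch_status_line("adevil.com", "/tgpweb/F14/b/index.htm", "wGet")  # User-Agent set
```

Comparing the two status lines for the same URL is enough to tell whether a blank User-Agent is what triggers the rejection.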
__________________
SnapReplay.com a different way to share photos - iPhone & Android
Old 2004-10-21, 08:39 AM   #6
Cleo
How can I see what User-Agent string my bots are sending?
Old 2004-10-21, 08:50 AM   #7
cd34
You can have it spider a page where you have access to the raw logs. Then grep the weblogs for the page that you had it spider.
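
As a rough sketch of the log approach, this assumes the server writes Apache's "combined" log format, where the User-Agent is the last quoted field (a missing header is logged as "-"). The function name user_agents_for is hypothetical, and the log path in the usage note is a placeholder.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def user_agents_for(path, log_lines):
    """Return (ip, status, user_agent) for each hit whose request line mentions path."""
    hits = []
    for line in log_lines:
        match = COMBINED_RE.match(line.strip())
        # Lines in another format (or with escaped quotes) are skipped.
        if match and path in match.group("request"):
            hits.append(
                (match.group("ip"), int(match.group("status")), match.group("agent"))
            )
    return hits
```

With a log file you can read, `user_agents_for("/yourtestpage.htm", open("access_log"))` would list what each bot sent; both the page name and the file name here are placeholders.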

You can also run:
grep -r User-Agent: *

in the directory where the script resides, to see where the script sets the header.

You could run ettercap, tcpdump or some other packet capture program on the server to watch packets as it does a check. That might be a bit overwhelming, as a LOT of stuff might be going on at the same time, and it usually requires root access -- only for the insanely curious.

There are a bunch of other ways -- a cgi script that dumps the environment to a location on the server, or emails you the info when someone hits it, etc.
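
A minimal sketch of the CGI idea, assuming the server runs Python CGI scripts (the thread doesn't say what is available on any particular box). Under CGI, request headers arrive as HTTP_* environment variables, so User-Agent shows up as HTTP_USER_AGENT. The function name dump_request_headers is made up for illustration.

```python
#!/usr/bin/env python3
import os
import sys


def dump_request_headers(environ=None):
    """Return the CGI HTTP_* variables as sorted 'NAME=value' lines."""
    environ = os.environ if environ is None else environ
    lines = [
        f"{key}={value}"
        for key, value in sorted(environ.items())
        if key.startswith("HTTP_")
    ]
    # Make a missing User-Agent obvious rather than silently absent.
    if "HTTP_USER_AGENT" not in environ:
        lines.append("HTTP_USER_AGENT is not set (blank User-Agent)")
    return "\n".join(lines)


def main():
    # CGI response: header block, blank line, then the body.
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    sys.stdout.write(dump_request_headers() + "\n")


if __name__ == "__main__":
    main()
```

Point the bot at the script's URL and whatever it sends comes back in the response body; writing the same text to a file or an email instead is a small change.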
Old 2004-10-21, 09:20 AM   #8
Greenguy
Are you submitting it with the /index.htm at the end, or just http://www.adevil.com/tgpweb/F14/b/ ? The reason I ask is that there's no index.html page in that directory, & if the script/bot thinks there should be one & doesn't find it, that may cause the 401 error.

Of course, I am not a tech guy
Old 2004-10-21, 09:36 AM   #9
Mateusz
Screw you, guys. I'm going home.
 
Join Date: Mar 2004
Location: Gliwice, Poland
Posts: 996
Neither the adevil.com nor the adevilsamateurs.com domain loads for me at all.
Old 2004-10-21, 12:26 PM   #10
aphrophoto
I have given my server guy the list of sites that turned me down in my last set of uploads, and he has come back with the following, which may be of interest to anyone following this:

''These sites are requesting URL's on your domain with a "^" character in the heading. This character is prohibited for security reasons and can be exploited. It is usually included in poorly written scripts and unnecessary to normal web server trafficing. However I have rescinded this security directive server wide to confirm/deny this fact. Please confirm that these link trades are now permissible?''

It will now be interesting to see what happens with my next batch of free site submissions.

With regard to .htm versus .html, I tried both and that wasn't the problem. Also, both domains should have the www prefix in front (e.g. http://www.adevil.com/). They seem to load OK for me; I hope they are OK for everyone else.
Old 2004-10-21, 02:19 PM   #11
cd34
Personally, I don't think his argument holds water: I demonstrated a GET request that didn't have a ^ in it. However, he does appear to have fixed the problem.

telnet adevil.com 80
Trying 67.19.97.178...
Connected to 178.67-19-97.reverse.theplanet.com.
Escape character is '^]'.
GET /tgpweb/F14/b/index.htm HTTP/1.0
Host: adevil.com

HTTP/1.1 200 OK
Date: Thu, 21 Oct 2004 18:17:53 GMT
Server: Apache/1.3.31 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2634a mod_ssl/2.8.20 OpenSSL/0.9.7a PHP-CGI/0.1b
Last-Modified: Wed, 20 Oct 2004 12:48:14 GMT
ETag: "28042d-31bb-41765e8e"
Accept-Ranges: bytes
Content-Length: 12731
Connection: close
Content-Type: text/html

That same request (no User-Agent: header) returned a 401 earlier.
All times are GMT -4.


Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
© Greenguy Marketing Inc