#1
You can now put whatever you want in this space :)
Let’s Talk Aggregator
First of all, http://www.rssxxxfeed.com is about four months old now, and for those who are curious about how it’s doing without reciprocal links, here ya go:

Aug: Unique visitors 54, Number of visits 105, SE hits 36
Sep: Unique visitors 1136, Number of visits 2639, SE hits 394
Oct: Unique visitors 2829, Number of visits 5527, SE hits 1876
Nov: Unique visitors 3350, Number of visits 6441, SE hits 2139
Dec: Unique visitors 4274, Number of visits 6763, SE hits 3500

I’m aggregating 1013 blog feeds, have 53800 individual posts listed, and it’s growing by 400+ posts a day.

The two main reasons I decided on not requiring a link back are:
1. I have no track record. If I were Tommy, Linkster, Greenie or Walrus I would have.
2. I don’t want the search engines mistaking me for a link farm.

The good thing about doing it this way is I don’t have to worry about pissing someone off because I deleted their blog (I’ve had to delete over 400 already, mostly Thumbloggers). The bad thing is that finding the blogs and listing them is a lot of boring work. I have Links Organizer installed and will be trading links, but I’m going to be real selective.

Here’s the shits of the whole thing: the aggregator script I’m using is Newstopica, and it’s slowed down to a fucking crawl. I’ve had Sparky and Kaktus take a look, and they both have said the coding leaves a lot to be desired. Support for Newstopica is a friggin joke. I’m too pig-headed to just shit-can the whole thing, so I got in line to have Kaktus code a custom aggregator for me.

Here’s where I need your help: if you were to have an aggregator written, what features would you want?
__________________
How To Keep An Asshole In Suspense
I'll Tell You Later

#2
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14
Support for tags and categories. I want to start an aggregator myself and I'm finding that most web aggregators don't seem to support them (at least by default). It's sad: I did a web search for PHP-based aggregators, and the ones I found are not that impressive.

#3
Oh no, I'm sweating like Roger Ebert
Accept pings and limit the number of pings you'll accept per day. Tags and a tag cloud would be nice, especially if they read my tags. Or you could require a minimum of one tag per post that points to your domain and only take the tags that point to you. (I am personally of the opinion that blogroll links are pretty useless.) Limit the number of front page posts so that surfers do use the tagging system rather than take the easy route and click what's in front of their face.
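
A minimal sketch of the "limit pings per day" idea, in Python. The class name `DailyPingLimiter`, the per-blog counting, and the default of 5 pings are assumptions made for illustration; the post does not say whether the cap should be per blog or site-wide.

```python
from collections import Counter
from datetime import date


class DailyPingLimiter:
    """Accept at most `max_per_day` pings per blog per calendar day (hypothetical helper)."""

    def __init__(self, max_per_day=5):
        self.max_per_day = max_per_day
        self._day = None          # the day the current counts belong to
        self._counts = Counter()  # blog URL -> pings accepted today

    def allow(self, blog_url, today=None):
        today = today or date.today()
        if today != self._day:
            # A new day has started: forget yesterday's counts.
            self._day = today
            self._counts.clear()
        if self._counts[blog_url] >= self.max_per_day:
            return False
        self._counts[blog_url] += 1
        return True
```

A ping handler would call `allow(blog_url)` first and simply ignore the ping when it returns `False`.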

#4
Certified Nice Person
From what I understand, the ever-decreasing speed of an aggregator as it ages is due to the fact that many aggregation scripts go through and scan what you have stored in your database from feed A and compare it to feed A's present state. That's a lot of work, especially when you have that many feeds being handled and a now monstrous database. A more efficient aggregator would look at the post dates and add only what's new from a feed, instead of checking to see if old posts have been updated/edited. If they didn't get it right the first time, fuck 'em.
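
The "add only what's new" approach described above could look roughly like this in Python. It is a sketch, not how any particular script works: the function name `select_new_posts` and the `published` key are assumptions, and it only works if the feed's post dates can be trusted.

```python
from datetime import datetime


def select_new_posts(posts, last_seen):
    """Return the posts published after `last_seen`, plus the new high-water mark.

    `posts` is a list of dicts with a `published` datetime (assumed shape).
    Older posts are never re-compared, so later edits to them are ignored.
    """
    fresh = [post for post in posts if post["published"] > last_seen]
    newest = max((post["published"] for post in fresh), default=last_seen)
    return fresh, newest
```

The aggregator would store one `last_seen` timestamp per feed and insert only the `fresh` posts, so the cost of an update depends on the size of the feed and not on the size of the database.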

#5
Lord help me, I'm just not that bright
I'd like one coded in C that generates new static pages every 2-5 minutes.

#6
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14
Quote:
http://www.planetplanet.org/ But if you know some C, you can always optimize the parts that need to be.

#7
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14
Quote:
Most feed aggregators should send a conditional request: If-Modified-Since with the server's Last-Modified value or, even better, If-None-Match with its ETag. The server's response (a 304 Not Modified or a fresh copy) tells you whether the feed has changed since the aggregator last checked it. You can read a little more about these here: http://diveintopython.org/http_web_s..._features.html
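
Here is a rough Python sketch of such a conditional fetch using only the standard library. The helper names `conditional_headers` and `fetch_if_changed` are made up for illustration, and the caller is assumed to store the ETag and Last-Modified values from the previous fetch of each feed.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers for a conditional GET from stored validators."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: nothing to parse, keep the old validators.
            return None, etag, last_modified
        raise
```

When `body` comes back as `None`, the aggregator can skip parsing and database work for that feed entirely.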

#8
Lord help me, I'm just not that bright
Quote:

#9
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14

#10
Lord help me, I'm just not that bright
Quote:
<?php header('Last-Modified: ' . $date_string); ?>

#11
Lord help me, I'm just not that bright
Quote:
Hey Cash, that looks like a nice script. I'll have to give it a shot. And Python is just cool. Thanks!

#12
Shut up brain, or I'll stab you with a Q-tip!
Join Date: Dec 2005
Posts: 118
My little aggregator is 2 years old:
http://www.sexblogdemon.com/ Its most important part is the intelligent Java spider/bot. It has detection for RSS-fed blogs and anti-spam. It is now handling over 10,000 feeds and is still hosted on the thumblogger server, imagine that. I guess it can handle over 100,000 feeds with ease. I must admit it seems a bit overkill to make such a big app, but I guess I had a lot of time back in the day.

#13
Certified Nice Person
Then you drop their feed. Done! One can't expect to let a site run itself and end up with quality. And if one forces an aggregator to look at every post in a heavy database, one must accept that the slowdown is going to occur.
Last edited by Useless; 2006-12-21 at 07:34 AM.

#14
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14

#15
Rock stars ... is there anything they don't know?
Join Date: Dec 2006
Posts: 14
Quote:
After this discussion, I'm thinking:
* It makes more and more sense to just keep 50 items or so per feed at any given time. I think this should be enough to keep visitors busy and should keep your script running not too badly.
* Writing your own aggregator is the way to go. Most of the ones I've used were too immature and didn't scale well. Gregarius and Lilina (PHP-based) would both start to choke at around 15,000 feed items (I read a lot of tech blogs).
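
The "50 items or so per feed" cap is easy to sketch in Python. The function name `prune_feed` and the `published` key are assumptions for illustration; a database-backed aggregator would do the equivalent with a delete query per feed.

```python
def prune_feed(items, keep=50):
    """Return the `keep` newest items of one feed, newest first.

    `items` is a list of dicts with a sortable `published` value (assumed shape).
    """
    newest_first = sorted(items, key=lambda item: item["published"], reverse=True)
    return newest_first[:keep]
```

Running this after each update keeps the total stored items bounded by roughly 50 times the number of feeds, no matter how old the aggregator gets.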

#16
You can now put whatever you want in this space :)
Wow, lots to think about. Thanx guys, keep 'em coming.

#17
You can now put whatever you want in this space :)
Sparky, my hero, did some of his voodoo shit last night and gave me a surprise Christmas present. He improved the hell out of the Newstopica script; it's smokin' fast now.