Let's Talk Aggregator
First of all, http://www.rssxxxfeed.com is about four months old now, and for those who are curious how it's doing without reciprocal links, here ya go:
Aug: 54 unique visitors, 105 visits, 36 SE hits
Sep: 1,136 unique visitors, 2,639 visits, 394 SE hits
Oct: 2,829 unique visitors, 5,527 visits, 1,876 SE hits
Nov: 3,350 unique visitors, 6,441 visits, 2,139 SE hits
Dec: 4,274 unique visitors, 6,763 visits, 3,500 SE hits

I'm aggregating 1,013 blog feeds, have 53,800 individual posts listed, and it's growing by 400+ posts a day. The two main reasons I decided on not requiring a link back are: 1) I have no track record; if I were Tommy, Linkster, Greenie or Walrus I would have. 2) I don't want the search engines mistaking me for a link farm. The good thing about doing it this way is I don't have to worry about pissing someone off because I deleted their blog (I've had to delete over 400 already, mostly Thumblogger ones). The bad thing is finding the blogs and listing them is a lot of boring work. I have Links Organizer installed and will be trading links, but I'm going to be real selective. Here's the shits of the whole thing: the aggregator script I'm using is Newstopica and it's slowed down to a fucking crawl. I've had Sparky and Kaktus take a look and they both said the coding leaves a lot to be desired. Support for Newstopica is a friggin joke. I'm too pig-headed to just shit-can the whole thing, so I got in line to have Kaktus code a custom aggregator for me. Here's where I need your help: if you were to have an aggregator written, what features would you want?
Support for tags and categories. I want to start an aggregator myself and I'm finding that most web aggregators don't seem to support them (at least by default). It's sad; I did a web search for PHP-based aggregators and the ones I found are not that impressive.
Accept pings and limit the number of pings you'll accept per day. Tags and a tag cloud would be nice, especially if it read my tags, or you could require a minimum of one tag per post that points to your domain and only take the tags that point to you. (I'm personally of the opinion that blogroll links are pretty useless.) Limit the number of front page posts so that surfers actually use the tagging system rather than take the easy route and click what's in front of their face.
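Just to make the ping-limit idea concrete, here's a rough sketch of how a daily cap could work. Real blog pings usually arrive as XML-RPC (weblogUpdates.ping); this only shows the limit logic, and the function, table and column names are made up for illustration:

<?php
// Hypothetical sketch: accept a ping only if the source blog hasn't hit its
// daily quota yet. The pings table and accept_ping() are placeholders.
function accept_ping(PDO $db, string $blogUrl, int $dailyLimit = 10): bool
{
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM pings WHERE blog_url = ? AND ping_date = CURDATE()'
    );
    $stmt->execute([$blogUrl]);
    if ((int) $stmt->fetchColumn() >= $dailyLimit) {
        return false; // over the limit for today, ignore it
    }

    $db->prepare('INSERT INTO pings (blog_url, ping_date) VALUES (?, CURDATE())')
       ->execute([$blogUrl]);
    return true; // caller can now queue that feed for a re-fetch
}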
From what I understand, the ever-decreasing speed of an aggregator as it ages is due to the fact that many aggregation scripts go through and scan what you have stored in your database from feed A and compare that to feed A's present state. That's a lot of work, especially when you have that many feeds being handled and a now-monstrous database. A more efficient aggregator would look at the post dates and add only what's new from a feed, instead of checking to see if old posts have been updated/edited. If they didn't get it right the first time, fuck 'em.
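A rough sketch of that "only add what's new" idea, assuming the feed has already been parsed into an array; the table and column names are made up for illustration:

<?php
// Hypothetical sketch: look up the newest post date already stored for a feed
// and insert only items newer than that, instead of re-comparing every old item.
function add_new_items(PDO $db, int $feedId, array $items): void
{
    $stmt = $db->prepare('SELECT MAX(post_date) FROM items WHERE feed_id = ?');
    $stmt->execute([$feedId]);
    $lastSeen = (int) $stmt->fetchColumn();

    $insert = $db->prepare(
        'INSERT INTO items (feed_id, guid, title, url, post_date) VALUES (?, ?, ?, ?, ?)'
    );
    foreach ($items as $item) {
        $date = strtotime($item['pubDate']);
        if ($date !== false && $date > $lastSeen) {
            $insert->execute([$feedId, $item['guid'], $item['title'], $item['link'], $date]);
        }
        // anything at or before $lastSeen is skipped; edits to old posts are ignored on purpose
    }
}

If a feed has sloppy dates, matching on the item GUID instead of (or as well as) the date would be the safer variant.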
I'd like one coded in C that generates new static pages every 2-5 minutes.
Quote:
http://www.planetplanet.org/ But if you know some C, you can always optimize the parts that need to be :)
Quote:
Most feed aggregators should check the server's Last-Modified header (with If-Modified-Since) or, even better, its ETag (with If-None-Match). These tell you whether a feed has changed since the aggregator last checked it. You can read a little more about them here: http://diveintopython.org/http_web_s..._features.html
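For what it's worth, a conditional GET from the fetching side could look roughly like this in PHP with cURL. Where the saved header values live is up to your script; the function and variable names below are just for illustration:

<?php
// Hypothetical sketch: send back the Last-Modified and ETag values saved from the
// previous fetch; a 304 answer means the feed is unchanged and there's nothing to parse.
function fetch_feed_if_changed(string $url, ?string $lastModified, ?string $etag): ?string
{
    $headers = [];
    if ($lastModified !== null) {
        $headers[] = 'If-Modified-Since: ' . $lastModified;
    }
    if ($etag !== null) {
        $headers[] = 'If-None-Match: ' . $etag;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($status === 304) {
        return null; // not modified since the last fetch, skip it
    }
    return $body === false ? null : $body;
}

You'd also want to capture the new Last-Modified/ETag response headers (for example via CURLOPT_HEADERFUNCTION) and store them for the next run.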
Quote:
<?php header('Last-Modified: ' . $date_string); ?>
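Assuming that header() call is on the feed-generating side and $date_string comes from the newest item's timestamp, the other half of the trick is honoring what the aggregator sends back. Just a sketch, not from the quoted script:

<?php
// If the client's If-Modified-Since is at least as new as our newest item,
// answer 304 with no body so the aggregator skips re-downloading the feed.
// $last_post_time is assumed to be a Unix timestamp of the newest post.
$lastModified = gmdate('D, d M Y H:i:s', $last_post_time) . ' GMT';

if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) &&
    strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $last_post_time) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}
header('Last-Modified: ' . $lastModified);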
Quote:
Hey Cash, that looks like a nice script. I'll have to give it a shot. And Python is just cool. Thanks!
My little aggregator is 2 years old:
http://www.sexblogdemon.com/ Its most important part is the intelligent Java spider/bot. It has detection for RSS-fed blogs and anti-spam. It is now handling over 10,000 feeds and it's still hosted on the Thumblogger server; imagine that. I guess it can handle over 100,000 feeds with ease. I must admit it seems a bit overkill to build such a big app, but I guess I had a lot of time back in the day :)
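That bot is Java, but the feed-detection part is simple enough to sketch: most blogs advertise their feed with a <link rel="alternate"> tag in the page head, so autodiscovery boils down to something like this (PHP here, purely illustrative, the function name is made up):

<?php
// Hypothetical feed autodiscovery sketch: fetch a page and look for
// <link rel="alternate" type="application/rss+xml"> (or Atom) in its markup.
function discover_feed_url(string $pageUrl): ?string
{
    $html = @file_get_contents($pageUrl);
    if ($html === false) {
        return null;
    }

    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from sloppy blog markup

    foreach ($doc->getElementsByTagName('link') as $link) {
        $rel  = strtolower($link->getAttribute('rel'));
        $type = strtolower($link->getAttribute('type'));
        if ($rel === 'alternate' &&
            ($type === 'application/rss+xml' || $type === 'application/atom+xml')) {
            return $link->getAttribute('href'); // may be relative; resolve against $pageUrl
        }
    }
    return null;
}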
Quote:
After this discussion, I'm thinking:
* It makes more and more sense to just keep 50 items or so per feed at any given time. I think this should be enough to keep visitors busy and should keep your script from bogging down (there's a quick pruning sketch below).
* Writing your own aggregator is the way to go. Most of the ones I've used were too immature and didn't scale well. Gregarius and Lilina (PHP-based) would both start to choke at around 15,000 feed items (I read a lot of tech blogs).
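If anyone wants to try the keep-about-50-items idea, pruning after each fetch could look something like this; the items table and its columns are placeholders, not from any particular script:

<?php
// Hypothetical pruning sketch: after updating a feed, delete everything but the
// newest $keep items for that feed.
function prune_feed(PDO $db, int $feedId, int $keep = 50): void
{
    // ids of the newest $keep items for this feed
    $stmt = $db->prepare(
        'SELECT id FROM items WHERE feed_id = ? ORDER BY post_date DESC LIMIT ?'
    );
    $stmt->bindValue(1, $feedId, PDO::PARAM_INT);
    $stmt->bindValue(2, $keep, PDO::PARAM_INT);
    $stmt->execute();
    $keepIds = $stmt->fetchAll(PDO::FETCH_COLUMN);

    if (count($keepIds) === 0) {
        return;
    }

    $placeholders = implode(',', array_fill(0, count($keepIds), '?'));
    $delete = $db->prepare(
        "DELETE FROM items WHERE feed_id = ? AND id NOT IN ($placeholders)"
    );
    $delete->execute(array_merge([$feedId], $keepIds));
}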
Wow, lots to think about. Thanks guys, keep 'em coming.
Sparky, my hero, did some of his voodoo shit last night and gave me a surprise Christmas present. He improved the hell out of the Newstopica script; it's smokin' fast now.