FeedDigest is hovering around 28-34 requests a second right now, or roughly 2.5 million requests per day, and seems to be handling it pretty well after the recent database and infrastructure changes. Perl and memcached are handling it all like champs, with memcached deserving extra credit for significantly improving the caching situation at FeedDigest.
The next problem is turning on the statistics features without frying everything. I've had to rethink my statistics analysis strategy. Doing an UPDATE on a statistics database table on each request is not viable at all. I'm considering two approaches:
1) Have a cronned job parse the general log file for FeedDigest every minute and update statistics in one go each minute.
2) Get the FeedDigest daemon to stack statistics data in memcached for another script to pull out each minute to update statistics in the DB.
The main difference between the approaches is that the first is more 'standard' and relies on log files we already generate, but places more load on the file system, while the second is memory based, but potentially more 'breakable'.
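To make the first approach concrete, here is a minimal sketch of the per-minute batch step, written in Python for illustration rather than the Perl that FeedDigest runs on. The log line format (`<timestamp> <digest_id> <event>`), the `stats` table, and the function names are all assumptions made up for the example, not FeedDigest's actual layout. SQLite stands in for the real database.

```python
import sqlite3
from collections import Counter

# Hypothetical schema: one running total per (digest, event) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS stats (
    digest_id TEXT NOT NULL,
    event     TEXT NOT NULL,
    hits      INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (digest_id, event)
)
"""

def count_hits(log_lines):
    """Tally hits per (digest_id, event) from lines shaped like
    '<timestamp> <digest_id> <event>' (an assumed format)."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed lines rather than abort the batch
        _timestamp, digest_id, event = parts
        counts[(digest_id, event)] += 1
    return counts

def apply_counts(conn, counts):
    """Add one batch of tallies to the stats table in a single transaction."""
    rows = [(d, e, hits) for (d, e), hits in counts.items()]
    with conn:  # commits on success, rolls back on error
        # Make sure a row exists for every pair seen in this batch.
        conn.executemany(
            "INSERT OR IGNORE INTO stats (digest_id, event, hits) VALUES (?, ?, 0)",
            [(d, e) for d, e, _ in rows],
        )
        # One UPDATE per distinct pair instead of one per request.
        conn.executemany(
            "UPDATE stats SET hits = hits + ? WHERE digest_id = ? AND event = ?",
            [(hits, d, e) for d, e, hits in rows],
        )
```

The point of the sketch is that the number of writes per minute depends on how many distinct digests were hit, not on how many requests came in. A real cron job would also need to remember how far into the log file it read last time, which is left out here.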
Since statistics aren't as essential as overall performance, I'll probably go with the memory storage system, especially if statistics are updated every minute or so. Losing a minute's worth of statistics from time to time would be acceptable given the performance benefits of doing it in memory.
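A rough sketch of how the memory-based approach might look, again in Python for illustration only. The `FakeCache` class is a dict-backed stand-in that imitates memcached's `add`/`incr`/`get`/`delete` behaviour (in particular, `incr` fails on a missing key), so the example runs without a server; it is not a real client library. The key layout, the per-minute buckets, and the function names are assumptions. Since memcached cannot list its keys, the draining script is assumed to know the digest IDs from elsewhere, such as the database.

```python
class FakeCache:
    """Dict-backed stand-in imitating a few memcached commands."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Like memcached 'add': store only if the key does not exist yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        # Like memcached 'incr': fails (returns None) on a missing key.
        if key not in self._data:
            return None
        self._data[key] += delta
        return self._data[key]

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return self._data.pop(key, None) is not None


def bucket_key(digest_id, minute):
    """Hypothetical key layout: one counter per digest per minute."""
    return f"stats:{minute}:{digest_id}"


def record_hit(cache, digest_id, now):
    """Called by the daemon on each request; 'now' is a Unix timestamp."""
    key = bucket_key(digest_id, int(now // 60))
    if cache.incr(key) is None:
        # Counter missing: create it at 1, or increment if another
        # process created it in the meantime.
        if not cache.add(key, 1):
            cache.incr(key)


def drain_minute(cache, digest_ids, minute):
    """Called by the separate script: collect and clear one finished minute."""
    totals = {}
    for digest_id in digest_ids:
        key = bucket_key(digest_id, minute)
        count = cache.get(key)
        if count:
            totals[digest_id] = count
            cache.delete(key)
    return totals
```

Bucketing by minute means the draining script only ever reads a minute the daemon has finished writing to, so the two never fight over the same counter. If memcached restarts or evicts a key before the drain runs, that minute's numbers are simply gone, which is the 'breakable' part accepted above.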
Of course, in an ideal world we'd be able to afford a server with 4GB of memory in the data center and run almost everything through memcached, but one step at a time.. :)
Just pull a Feedburner and update the stats once a day.
Posted by: PJ Hyett at December 14, 2005 12:09 AM
I design services to work in a way that fits with me, generally, as that keeps me motivated to implement all the features ;-) I like checking my statistics on a very regular basis, and so would want visitor and clickthrough numbers to be updating almost constantly.
Posted by: Peter Cooper at December 14, 2005 11:54 AM