2004-04-01 · in Ideas · 205 words

Many rawdog users run it on a shared machine provided by their web hosting company. This is fine if there's only one rawdog user on each machine, but it's inefficient if two rawdog users subscribe to some of the same feeds; rawdog has to fetch them twice. Some sites (such as Slashdot) detect this and block hosts that fetch their feeds too often.

A simple way to get around this would be a web proxy designed specifically to deal with RSS feeds: it would make certain that no feed is fetched more often than every 30 minutes (and this could be made customisable for feeds that don't change very often). A bonus would be that it'd be easy to implement best-practice methods in the proxy -- gzip encoding, ETag support -- to minimise bandwidth usage, and individual RSS aggregators could just do a simple HTTP fetch.
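Something like this might only take a page of Python. Here's a rough sketch, not a real implementation: the URL-in-the-path convention, the in-memory cache and the port number are just for illustration, and it skips gzip handling, timeouts and locking:

```python
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

MIN_INTERVAL = 30 * 60   # never hit an origin feed more than once per 30 minutes
cache = {}               # feed URL -> (fetched_at, etag, body)

class FeedProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        feed_url = self.path.lstrip("/")
        entry = cache.get(feed_url)

        if entry is None or time.time() - entry[0] > MIN_INTERVAL:
            # Time to refetch; use a conditional GET so unchanged feeds cost little.
            request = urllib.request.Request(feed_url)
            if entry is not None and entry[1]:
                request.add_header("If-None-Match", entry[1])
            try:
                with urllib.request.urlopen(request) as response:
                    entry = (time.time(),
                             response.headers.get("ETag", ""),
                             response.read())
            except urllib.error.HTTPError as error:
                if error.code == 304 and entry is not None:
                    # Not modified: keep the cached body, just reset the timer.
                    entry = (time.time(), entry[1], entry[2])
                else:
                    self.send_error(error.code)
                    return
            cache[feed_url] = entry

        # Everyone gets the cached copy via a plain HTTP fetch.
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.end_headers()
        self.wfile.write(entry[2])

if __name__ == "__main__":
    HTTPServer(("", 8080), FeedProxy).serve_forever()
```

An aggregator would then be pointed at (say) http://localhost:8080/http://example.com/feed.xml rather than at the feed itself, and however many users subscribe, the origin site only sees one fetch per half hour.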

It'd also be possible to modify feed data on the fly: fixing invalid feeds, or removing uninteresting articles or advertising.
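Again only as a sketch (the keyword list and function name are invented for the example), removing uninteresting items from an RSS 2.0 feed could be a small hook the proxy runs before caching:

```python
import xml.etree.ElementTree as ET

BORING_WORDS = ("sponsored", "advertisement")

def filter_items(feed_xml: bytes) -> bytes:
    """Drop RSS <item>s whose titles contain any uninteresting keyword."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is not None:
        for item in list(channel.findall("item")):
            title = (item.findtext("title") or "").lower()
            if any(word in title for word in BORING_WORDS):
                channel.remove(item)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

(Atom and namespaced feeds would need slightly different handling, and fixing invalid feeds would want something more forgiving than a strict XML parser.)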

Shrook does something related to this, but it's rather more complex (and non-standardised): aggregators ask a central server whether it has a recent copy of a page; if it doesn't, they fetch the page themselves and send it to the server.