What is MetaRSS?

June 3rd, 2006

Metafilter offers RSS feeds for the front pages of MeFi and AskMe. While I am thankful for the feeds, they do not help me keep up with activity after I’ve seen a thread. Every hour MetaRSS updates the threads it knows about, for up to thirty days. The next time your RSS reader hits MetaRSS, the newest comments will be visible. (Please don’t update your RSS more frequently than once an hour.)

MetaRSS moved!

June 4th, 2006

As much fun as playing with Joomla was, I just don’t have the time or patience to stick with it.

Not to mention, the end result of said futzing was, well, pretty unusable. Therefore, MetaRSS has moved to this mostly empty blog. None of the script locations have changed, but the user interface doesn’t need the complications of an obtuse CMS. All of the tools you may have had trouble finding previously are now handily available on the pages of comma space.

MetaRSS: The Open Source Project

October 10th, 2006

It’s no secret: MetaRSS was hacked together with the best of intentions in the least amount of time possible. As such, MetaRSS went live in alpha and never saw much further development. Since that time, the name has been taken by another company, and several bugs have been discovered, reported, and fixed (more on that later). Given that there is, at the very least, a modicum of interest in extensible RSS feeds for Metafilter, and little promise that they’ll actually show up as a feature, I’ve decided to put the code out there.

SourceForge

October 14th, 2006

MetaRSS was approved, though DNS still has not updated.

More to come.

MetaRSS Diagram

October 14th, 2006

MetaRSS Workflow

Essentially, MetaRSS consists of the following components:

  1. A cron job
  2. An indexer
  3. Metafilter content
  4. An HTML cache
  5. The user interface

The process is initiated by a user request for an RSS feed. This is currently done with a Greasemonkey script or a bookmarklet. The request simply feeds the user interface a URL. The user interface caches the HTML and stores the URL for later use by the indexer. The user interface then parses the HTML and returns the RSS to the user. At fixed intervals, a cron job kicks off the indexer, which simply refreshes the cached HTML; the cache is only parsed when a user requests the feed.
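The workflow above can be sketched in a few lines. This is a rough illustration in Python, not the actual MetaRSS code: the in-memory cache, the function names, the injected `fetch` callable, and the `<div class="comment">` markup are all assumptions made for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class CommentParser(HTMLParser):
    """Collects the text of every <div class="comment"> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._depth = 0  # greater than 0 while inside a comment div

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self._depth:
                self._depth += 1  # nested div inside a comment
            elif ("class", "comment") in attrs:
                self._depth = 1
                self.comments.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.comments[-1] += data


html_cache = {}   # url -> most recently fetched HTML (stand-in for the HTML cache)
known_urls = []   # threads the indexer should keep refreshing


def html_to_rss(url, html):
    """Parse cached HTML and build a bare-bones RSS 2.0 document."""
    parser = CommentParser()
    parser.feed(html)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Comments for " + url
    ET.SubElement(channel, "link").text = url
    ET.SubElement(channel, "description").text = "Comments scraped from " + url
    for text in parser.comments:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "description").text = text.strip()
    return ET.tostring(rss, encoding="unicode")


def request_feed(url, fetch):
    """User interface: cache the page on first sight, remember the URL, return RSS."""
    if url not in html_cache:
        html_cache[url] = fetch(url)
        known_urls.append(url)
    return html_to_rss(url, html_cache[url])


def run_indexer(fetch):
    """Indexer (kicked off by cron): refresh the cached HTML only, no parsing."""
    for url in known_urls:
        html_cache[url] = fetch(url)
```

Here `fetch` is any callable that takes a URL and returns HTML, so the sketch can be exercised without touching the network. Parsing happens only inside `request_feed`, which mirrors the split described above: the indexer caches, the user interface parses.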

Pretty simple and open to a lot of improvement.

For example, the parser is very rigid. Essentially it’s hard-coded, using some weird combo of HTML::Parser and regular expressions. If the parser could scrape more flexibly, MetaRSS could return more types of RSS feeds.
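One way a more flexible scraper might look is a parser driven by a small table of rules instead of hard-coded patterns. The sketch below is in Python for illustration only; the rule table, the feed type names, and the tag and class names are hypothetical and do not reflect Metafilter's real markup.

```python
from html.parser import HTMLParser


class RuleParser(HTMLParser):
    """Collects text from every element matching a (tag, css_class) rule."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag, self.css_class = tag, css_class
        self.matches = []
        self._depth = 0  # greater than 0 while inside a matching element

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1  # same tag nested inside a match
        elif self.css_class in classes:
            self._depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.matches[-1] += data


# Hypothetical rules: one entry per kind of feed the scraper could produce.
FEED_RULES = {
    "comments": ("div", "comment"),
    "posters": ("span", "author"),
}


def scrape(html, feed_type):
    """Return the stripped text of every element matching the rule for feed_type."""
    parser = RuleParser(*FEED_RULES[feed_type])
    parser.feed(html)
    return [match.strip() for match in parser.matches]
```

Adding a new kind of feed would then mean adding a line to `FEED_RULES` rather than writing another round of regular expressions.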

Additionally, the algorithm that indexes the requested Metafilter pages could be optimized by using last-modified times to decide whether the whole page needs to be fetched again.
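A common way to do this is an HTTP conditional GET: send the previously seen `Last-Modified` value back in an `If-Modified-Since` header and skip the download when the server answers 304. The sketch below uses Python's standard `urllib`; the function name and the `opener` parameter are assumptions for the example, and it only helps if the server actually sends `Last-Modified` and honors the header, which I have not verified for Metafilter.

```python
import urllib.error
import urllib.request


def fetch_if_modified(url, last_modified=None, opener=urllib.request.urlopen):
    """Return (html, last_modified); html is None when the server answers 304.

    `opener` defaults to urllib.request.urlopen and exists only so the
    function can be exercised without a network connection.
    """
    request = urllib.request.Request(url)
    if last_modified:
        # Ask the server to send the page only if it changed since last time.
        request.add_header("If-Modified-Since", last_modified)
    try:
        with opener(request) as response:
            body = response.read().decode("utf-8", errors="replace")
            return body, response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep using the cached copy and the old timestamp.
            return None, last_modified
        raise
```

The indexer would store the returned timestamp next to each cached page and overwrite the cache only when `html` is not `None`.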