Re: PubSub Framework Alternative
- Subject: Re: PubSub Framework Alternative
- From: Jens Alfke <email@hidden>
- Date: Sun, 12 Aug 2012 11:29:57 -0700
On Aug 12, 2012, at 4:34 AM, Christian Kienle <email@hidden> wrote:
> I reported these assertions years ago but got no response.
Disclaimer: I wrote about ⅔ of the PubSub framework.
After I left Apple at the end of 2007 I don't think anyone else put any work into the framework.
> If I had to do it again I would simply use the NSXML* classes to write my own Atom/RSS parser. It should not be that hard and you can certainly do a better job than the PubSub team did.
*hollow laugh*
Fetching feeds is _hard_, at least if you want to fetch arbitrary feeds. RSS is a horribly vague file format, and many of the people implementing feeds on websites seem to just dump content into PHP templates instead of using any real XML API, so you run into lots and lots of problems like
— There are at least 9 (by Mark Pilgrim's count) different published dialects of RSS and Atom
— There are various metadata extensions like Dublin Core you have to be aware of
— Some feeds are malformed XML and have to be cleaned up before they can be parsed at all
— There are so very many different date formats people use. I think we ran into at least 20. You need either a very smart custom date parser or just a list of 20+ formats to attempt to parse, one after another. Oh, and time zones and DST are super fun to deal with.
— A lot of article bodies contain malformed HTML, often just tag soup that some blogger typed in by hand
— Many feeds have problems with quoting in headlines or article bodies, which requires using heuristics to figure out whether they really meant "&amp;" or a literal ampersand
— Likewise for whether articles are HTML or plain text
— Uniquing items between fetches can be challenging, especially for older feed formats that don't have article UUIDs or permalinks.
— Remember to use conditional GETs or websites will get mad at you for spamming their feeds
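The "list of formats to attempt, one after another" approach to dates might look roughly like the sketch below. It is not how PubSub did it; the format list is a short hypothetical sample rather than the 20+ formats mentioned above, the name parse_feed_date is made up for illustration, and treating zone-less dates as UTC is an assumption that will be wrong for some feeds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample of fallback formats; a real aggregator needs many more.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",  # ISO 8601 with numeric offset (typical for Atom)
    "%Y-%m-%dT%H:%M:%S",    # ISO 8601 without a zone
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%m/%d/%Y",
]


def parse_feed_date(text):
    """Return an aware UTC datetime, or None if no known format matches."""
    text = text.strip()
    parsed = None

    # First try the RFC 822 style that RSS 2.0 nominally uses.
    try:
        parsed = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        parsed = None

    if parsed is None:
        # Rewrite a trailing "Z" as an explicit offset so %z can match it.
        candidate = text[:-1] + "+0000" if text.endswith("Z") else text
        for fmt in CANDIDATE_FORMATS:
            try:
                parsed = datetime.strptime(candidate, fmt)
                break
            except ValueError:
                continue

    if parsed is None:
        return None
    if parsed.tzinfo is None:
        # Assumption: dates without a zone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Returning None instead of raising lets the caller fall back to the fetch time when a feed's date is unparseable, which seems a reasonable default for this kind of input.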
Now, if there's only one feed that your application needs to fetch, and if you're responsible for creating that feed on the server side (or can influence the people who are), a lot of these problems go away because you can enforce that everything is using the correct formats. But we weren't so lucky.
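Even with a single well-behaved feed, the conditional GET from the list above still applies. A minimal sketch using only the Python standard library follows; the function names conditional_headers and fetch_feed are hypothetical, and it assumes the caller stored the ETag and Last-Modified values from the previous successful fetch.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the validator headers for a conditional GET."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None, timeout=30):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        # urllib reports 304 Not Modified as an HTTPError; that is the
        # "nothing new" case, so keep the validators we already have.
        if error.code == 304:
            return None, etag, last_modified
        raise
```

The caller would persist the returned validators per feed and pass them back on the next poll, so an unchanged feed costs the server only a 304 response.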
—Jens
_______________________________________________
Cocoa-dev mailing list (email@hidden)