Mailing Lists: Apple Mailing Lists


Re: Converting from HTML



Following that link and pasting the code, I find it doesn't work correctly. It's replacing & with a space!

All I'm trying to do is parse the RSS feed URL out of the <head> section of a document. I didn't think it'd be so difficult!

As this app is going to be used on a specific site, I think I'll just do manual string replacements for now, as it seems
the easiest solution.
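
A minimal sketch of the manual-replacement idea, written in Python purely to illustrate the logic (it is not Cocoa code). The entity table and the name `decode_known_entities` are assumptions made for illustration; a real site may emit entities that are not listed here. The detail worth carrying over is the ordering: `&amp;` is replaced last, so doubly escaped text is not decoded twice.

```python
# Hypothetical table of the few entities the target site is assumed to emit.
KNOWN_ENTITIES = [
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&quot;", '"'),
    ("&#39;", "'"),
    ("&amp;", "&"),  # kept last so "&amp;lt;" becomes "&lt;", not "<"
]


def decode_known_entities(text):
    """Replace each known entity with its literal character, in table order."""
    for entity, literal in KNOWN_ENTITIES:
        text = text.replace(entity, literal)
    return text


if __name__ == "__main__":
    print(decode_known_entities("/search/unique&amp;stuff&amp;here"))
```

Anything outside the table passes through unchanged, which is acceptable only when the input comes from one known site.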


On 18 Oct 2008, at 00:22, Aurora Phoenix wrote:

Hi DI... Depending on how thoroughly you need to understand the structure of the
HTML, a simple parse using string chopping/ranges might be sufficient, or
(as I would personally prefer) you could use something like libxml2 / XPath to
grok the input. Note that simply resolving entities might not be sufficient,
particularly because you seem to be grabbing a URI/URL. If you are grabbing
URLs with the intent of passing them on to the URL Loading System (NSURL*,
NSHTTPURL*), you will also need to ensure on the back end that the string
is URL-encoded before you pass it along (e.g., replacing spaces with %20 and such).
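
The percent-encoding step mentioned above can be sketched as follows. This is Python used only for illustration, not the Cocoa API; the function name `percent_encode_path` and the choice of characters left untouched are assumptions, and a real application may need a different set.

```python
from urllib.parse import quote


def percent_encode_path(path):
    """Percent-encode characters such as spaces that are not valid in a URL.

    Characters that already carry meaning in a URL ("/", "?", "&", "=") are
    left alone, and "%" is left alone so existing escapes are not re-encoded.
    """
    return quote(path, safe="/?&=%")


if __name__ == "__main__":
    print(percent_encode_path("/search/unique stuff&here"))
```

Leaving "%" untouched is a trade-off: it avoids double-encoding, but a literal percent sign in the input would then stay unescaped.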


Someone else has posted the link to the ThinkMac blog, which has a snippet for
cheap character entity resolution in Objective-C
(http://www.thinkmac.co.uk/blog/2005/05/removing-entities-from-html-in-cocoa.html).
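
For comparison, general entity resolution (named and numeric references) looks roughly like this. This is not the ThinkMac snippet and not Objective-C; it is a Python illustration that relies on the standard library's `html.unescape`, and the wrapper name `resolve_entities` is hypothetical.

```python
import html


def resolve_entities(text):
    """Decode named (&amp;), decimal (&#38;) and hex (&#x26;) references."""
    return html.unescape(text)


if __name__ == "__main__":
    print(resolve_entities("a &amp; b &#38; c &#x26; d"))
```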



Cheers and good luck!


On 10/16/08 19:17 , "Drarok Ithaqua" <email@hidden> wrote:

Hi all, I'm trying to find a way to convert an HTML-originated URL
into one I can use in Cocoa.

Example input: <link type="application/rss+xml" rel="alternate" href="/search/unique&amp;stuff&amp;here" />


I know the URL that this data is fetched from, so I can prefix that to
get a full URL again, but I also need to convert the &amp; entities
into plain ampersands, and there could be all kinds of HTML character
entities in there. Is there a category on NSString out there I could
use for this?
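
The two steps described here (decode the entities, then resolve the result against the page's own URL) can be sketched like this. It is a Python illustration, not an NSString category; `absolute_feed_url` is a hypothetical name and `http://example.com/some/page.html` is a made-up page URL.

```python
import html
from urllib.parse import urljoin


def absolute_feed_url(page_url, raw_href):
    """Decode HTML entities in raw_href, then resolve it against page_url."""
    decoded = html.unescape(raw_href)
    return urljoin(page_url, decoded)


if __name__ == "__main__":
    # Hypothetical page URL; the href is the example from the message above.
    print(absolute_feed_url("http://example.com/some/page.html",
                            "/search/unique&amp;stuff&amp;here"))
```

Resolving against the page URL, rather than concatenating strings, also copes with hrefs that are relative to the current directory or already absolute.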


I read somewhere that I could use an NSAttributedString and
initWithHTML, but that leaves me with an empty string. I'm guessing that's
because the element is inside a <head> tag? Not sure.

I'm also open to using something more intelligent than my current
method of searching the string for "<link " to find the RSS feed, if
there's perhaps an easier way that would also convert the HTML characters
for me. Maybe WebKit has something for me?
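
As an illustration of the "more intelligent" route, here is a sketch that lets an HTML parser find the feed link instead of searching for "<link ". It uses Python's standard `html.parser` module rather than WebKit or libxml2, so it shows the idea and not a Cocoa solution; the names `FeedLinkFinder` and `find_feed_hrefs` are hypothetical. The parser decodes entities in attribute values, so the href comes back with plain ampersands.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> RSS elements."""

    def __init__(self):
        super().__init__()
        self.feed_hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        if "alternate" in rel_values and mime_type == "application/rss+xml":
            href = attributes.get("href")
            if href:
                self.feed_hrefs.append(href)


def find_feed_hrefs(html_text):
    """Return the hrefs of all RSS <link> elements found in html_text."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.feed_hrefs


if __name__ == "__main__":
    page = ('<html><head><link type="application/rss+xml" rel="alternate" '
            'href="/search/unique&amp;stuff&amp;here" /></head></html>')
    print(find_feed_hrefs(page))
```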

I look forward to your replies, and you have my thanks in advance.

 - Drarok
_______________________________________________

Cocoa-dev mailing list (email@hidden)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/cocoa-dev/aurora.phoenix.draco%40gmail.com


This email sent to email@hidden





