Re: Converting from HTML
- Subject: Re: Converting from HTML
- From: Drarok Ithaqua <email@hidden>
- Date: Sun, 19 Oct 2008 21:51:30 +0100
Following that link and pasting the code, I find it doesn't work
correctly. It's replacing &amp; with a space!
All I'm trying to do is parse the RSS feed URL out of the <head>
section of a document. I didn't think it'd be so difficult!
As this app is going to be used on a specific site, I think I'll just
do manual string replacements for now, as that seems the easiest
solution.
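The manual replacement approach can be sketched in plain C, which is
callable as-is from Objective-C. This is only a minimal sketch and not
the ThinkMac snippet: the name decode_entities and the entity table are
illustrative assumptions. It handles five named entities plus decimal
numeric references in the ASCII range, and leaves anything else
untouched.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: only the entities most likely to show up in a URL. */
static const struct { const char *name; char ch; } kEntities[] = {
    { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
    { "&quot;", '"' }, { "&apos;", '\'' },
};

/* Decodes entities in place. The output is never longer than the input,
   so the write pointer can safely trail the read pointer. */
void decode_entities(char *s)
{
    char *out = s;
    while (*s) {
        if (*s == '&') {
            size_t i, n = sizeof kEntities / sizeof kEntities[0];
            for (i = 0; i < n; i++) {
                size_t len = strlen(kEntities[i].name);
                if (strncmp(s, kEntities[i].name, len) == 0) {
                    *out++ = kEntities[i].ch;
                    s += len;
                    break;
                }
            }
            if (i < n)
                continue;   /* a named entity was consumed */
            /* Decimal numeric reference such as &#38; (ASCII range only). */
            if (s[1] == '#' && isdigit((unsigned char)s[2])) {
                char *end;
                long code = strtol(s + 2, &end, 10);
                if (*end == ';' && code > 0 && code < 128) {
                    *out++ = (char)code;
                    s = end + 1;
                    continue;
                }
            }
        }
        *out++ = *s++;      /* ordinary character or unknown entity */
    }
    *out = '\0';
}
```

Called on a writable buffer holding "/search/unique&amp;stuff&amp;here",
this leaves "/search/unique&stuff&here" in the buffer. Unknown entities
such as &nbsp; are copied through unchanged rather than replaced with a
space.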
On 18 Oct 2008, at 00:22, Aurora Phoenix wrote:
Hi DI... Depending on how deeply you need to understand the structure
of the HTML, a simple parse using string chopping/ranges might be
sufficient, or (as I would personally prefer) you could use something
like libxml2 / XPath to grok the input. Note that simple resolution of
entities might not be sufficient, particularly because you seem to be
grabbing URIs/URLs. If you are grabbing URLs with the intent of passing
them on to the URL Loading System (NSURL*, NSHTTPURL*), you will also
need to ensure that the string itself is URL-encoded before passing it
on (e.g., replacing spaces with %20 and such).
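The URL-encoding step can be sketched in plain C as follows. This is a
sketch under stated assumptions, not a full RFC 3986 encoder: the name
percent_encode is hypothetical, and the function escapes only
characters that are never legal in a URL (spaces, control characters,
non-ASCII bytes and a few punctuation marks). It leaves '%', '/', '?',
'&' and '=' alone, on the assumption that the input is already a
structured URL. In Cocoa itself,
-[NSString stringByAddingPercentEscapesUsingEncoding:] covers similar
ground.

```c
#include <string.h>

/* Percent-encodes src into dst. Returns 0 on success, or -1 if dst
   (of size dstsize, including the terminating NUL) is too small. */
int percent_encode(const char *src, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    if (dstsize == 0)
        return -1;
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;
        /* Space, control characters, non-ASCII bytes, and punctuation
           that may not appear raw in a URL. */
        int unsafe = c <= 0x20 || c >= 0x7F
                  || strchr("\"<>\\^`{|}", c) != NULL;
        if (unsafe) {
            if (o + 3 >= dstsize)
                return -1;
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0F];
        } else {
            if (o + 1 >= dstsize)
                return -1;
            dst[o++] = (char)c;
        }
    }
    dst[o] = '\0';
    return 0;
}
```

Because '%' is left alone, a string that is already encoded is not
encoded twice. The trade-off is that a literal percent sign in the
input is not escaped either.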
Someone else has posted the link to the ThinkMac blog, which has a
snippet for cheap character entity resolution in Objective-C:
http://www.thinkmac.co.uk/blog/2005/05/removing-entities-from-html-in-cocoa.html
Cheers and good luck!
On 10/16/08 19:17 , "Drarok Ithaqua" <email@hidden> wrote:
Hi all, I'm trying to find a way to convert an HTML-originated URL
into one I can use in Cocoa.
Example input:
<link type="application/rss+xml" rel="alternate" href="/search/unique&amp;stuff&amp;here" />
I know the URL that this data is fetched from, so I can prefix that to
achieve a full URL again, but I need to convert the &amp; into plain
ampersands, and there could be all kinds of HTML entities in there. Is
there a category on NSString out there I could use for this?
I read somewhere that I could use an NSAttributedString and
initWithHTML, but that leaves me with an empty string. I'm guessing
that's because it's inside a <head> tag? Not sure.
I'm also open to using something more intelligent than my current
method of searching the string for "<link " to find the RSS feed, if
there's perhaps an easier way that would also convert the HTML
entities for me. Maybe WebKit has something for me?
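For reference, the "search the string for <link" method can be sketched
in plain C like this. It is a rough sketch with loud assumptions: the
name find_feed_href is hypothetical, tags and attribute names are
assumed to be lowercase, attribute values double-quoted, and no '>' is
expected inside an attribute value. A real HTML parser (libxml2, for
instance) would not need any of these assumptions.

```c
#include <string.h>

/* Copies the href of the first <link> tag that mentions
   application/rss+xml into out. Returns 0 on success, or -1 if no such
   tag is found or out (of size outsize) is too small. The value is
   copied raw, so entities such as &amp; are still encoded. */
int find_feed_href(const char *html, char *out, size_t outsize)
{
    const char *tag = html;

    while ((tag = strstr(tag, "<link")) != NULL) {
        const char *close = strchr(tag, '>');
        if (close == NULL)
            return -1;      /* unterminated tag */

        const char *type = strstr(tag, "application/rss+xml");
        const char *href = strstr(tag, "href=\"");

        /* Both matches must fall inside this tag, not a later one. */
        if (type && type < close && href && href < close) {
            href += strlen("href=\"");
            const char *endq = strchr(href, '"');
            if (endq == NULL || endq > close)
                return -1;
            size_t len = (size_t)(endq - href);
            if (len >= outsize)
                return -1;
            memcpy(out, href, len);
            out[len] = '\0';
            return 0;
        }
        tag = close + 1;    /* move on to the next tag */
    }
    return -1;
}
```

For the example input above this yields the raw attribute value, which
still needs its entities decoded. The decoded path can then be resolved
against the page URL, for example with
+[NSURL URLWithString:relativeToURL:].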
I look forward to your replies, and you have my thanks in advance.
- Drarok
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden