Re: ARC question
- Subject: Re: ARC question
- From: glenn andreas <email@hidden>
- Date: Mon, 29 Oct 2012 09:30:16 -0500
On Oct 29, 2012, at 8:34 AM, Mike Abdullah <email@hidden> wrote:
>
> On 29 Oct 2012, at 11:44, Vincent Habchi <email@hidden> wrote:
>
>> On 29 Oct 2012, at 12:34, Mike Abdullah <email@hidden> wrote:
>>
>>> The code is fairly inefficient to start with, but no, it's not going to leak.
>>
>> Thanks. I am aware of this, but since this code is going to be part of a didactic article on writing a WMS client, I emphasize clarity over performance, which is a secondary concern here.
>>
>> However, I am interested in knowing how you would write such a translator yourself to make it more efficient. My initial idea was to copy every character up to a '&', at which point the content that follows would be analyzed and replaced if necessary, and so on until the end of the HTML string. That would mean a single pass over the string instead of one pass per pair in the dictionary.
>
> Well, you can ask CFXMLCreateStringByUnescapingEntities() to do this on OS X, although if I recall correctly, all the CFXML functions have now sadly been deprecated. The source code for it should still be available if you search around.
>
> But in general, I would just work my way through the string looking for occurrences of '&' and check whether each one starts a valid escape sequence. Much of the problem when dealing with HTML rather than XML is that there is a vast range of special sequences, e.g. &micro; for µ.
>
Given that there are also decimal (&#DD;) and hexadecimal escape sequences (&#xHHHH;) in HTML, trying to support those through the use of a dictionary of sequence -> replacement is going to be impractical.
Scanning through the string to find & and testing for a valid escape sequence at each one (covering both the 250 or so named entities and the numeric escape sequences) is the right way to go: the time spent depends on the length of the string and the number of escape sequences it actually contains, not on the number of possible escape sequences.
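For illustration, here is a minimal sketch of that single-pass scan in plain C (which compiles unchanged in an Objective-C file). It is not the CFXML implementation, and everything in it is an assumption made for the example: the names unescape_entities, decode_reference and kEntities are hypothetical, the entity table is deliberately tiny (a real one would list all the named entities and would probably be sorted or hashed), and the output is assumed to be UTF-8.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, deliberately tiny table; a real one would hold every named entity. */
struct entity { const char *name; unsigned long codepoint; };
static const struct entity kEntities[] = {
    {"amp", 0x26}, {"lt", 0x3C}, {"gt", 0x3E}, {"quot", 0x22}, {"micro", 0xB5},
};
enum { kEntityCount = sizeof kEntities / sizeof kEntities[0], kMaxBody = 10 };

/* Write code point cp as UTF-8 into out; returns the number of bytes written. */
static size_t put_utf8(unsigned long cp, char *out) {
    if (cp < 0x80) { out[0] = (char)cp; return 1; }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* s points at '&'. If a valid reference starts here, store its code point in
   *cp and return the number of input bytes consumed (including '&' and ';');
   otherwise return 0 so the caller copies the '&' literally. */
static size_t decode_reference(const char *s, unsigned long *cp) {
    const char *body = s + 1;
    size_t len = 0;
    /* Bounded look-ahead for ';' so a stray '&' never costs a full rescan. */
    while (len <= kMaxBody && body[len] != '\0' && body[len] != ';') len++;
    if (len == 0 || body[len] != ';') return 0;

    if (body[0] == '#') {                        /* &#DD; or &#xHHHH; */
        unsigned long base = 10, v = 0;
        size_t i = 1;
        if (body[i] == 'x' || body[i] == 'X') { base = 16; i++; }
        if (i == len) return 0;                  /* no digits at all */
        for (; i < len; i++) {
            char c = body[i];
            unsigned long d;
            if (c >= '0' && c <= '9') d = (unsigned long)(c - '0');
            else if (base == 16 && c >= 'a' && c <= 'f') d = (unsigned long)(c - 'a') + 10;
            else if (base == 16 && c >= 'A' && c <= 'F') d = (unsigned long)(c - 'A') + 10;
            else return 0;
            v = v * base + d;
            if (v > 0x10FFFF) return 0;          /* beyond Unicode range */
        }
        if (v == 0 || (v >= 0xD800 && v <= 0xDFFF)) return 0;
        *cp = v;
        return len + 2;
    }

    for (size_t k = 0; k < kEntityCount; k++) {  /* named entity lookup */
        if (strlen(kEntities[k].name) == len &&
            strncmp(kEntities[k].name, body, len) == 0) {
            *cp = kEntities[k].codepoint;
            return len + 2;
        }
    }
    return 0;
}

/* Returns a newly malloc'ed UTF-8 string (caller frees), or NULL on failure.
   With this table every reference is at least as long as its UTF-8
   replacement, so the output never outgrows the input. */
char *unescape_entities(const char *in) {
    size_t n = strlen(in), o = 0;
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;
    for (size_t i = 0; i < n; ) {
        unsigned long cp;
        size_t used = (in[i] == '&') ? decode_reference(in + i, &cp) : 0;
        if (used > 0) { o += put_utf8(cp, out + o); i += used; }
        else out[o++] = in[i++];
    }
    out[o] = '\0';
    return out;
}
```

Calling unescape_entities("a &lt; b") should give "a < b", while anything that does not parse as a reference (such as the "&T;" in "AT&T;") is copied through untouched. Numeric references are computed rather than looked up, so only the named entities need a table.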
Glenn Andreas email@hidden
<http://www.gandreas.com/> wicked fun!
Mad, Bad, and Dangerous to Know