Re: 3rd Party Nonsense (was Re: Regular Expressions?)
  • Subject: Re: 3rd Party Nonsense (was Re: Regular Expressions?)
  • From: Jens Alfke <email@hidden>
  • Date: Mon, 9 Jun 2008 20:17:37 -0700


On 8 Jun '08, at 3:39 AM, Michael Ash wrote:

> I never cared about the lack of regex support personally, although I
> understand that people do use them. As far as a blessed solution goes,
> "man regex" gives you a library that's in libSystem and is part of
> POSIX, so it's as supported as you can get.

And (as discussed a few weeks ago) it's not Unicode-savvy, which could bite the unwary developer in the ass, especially when localizing an app into a language with a non-Roman script, like Japanese.
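
To make the Unicode point concrete, here is a minimal sketch using the POSIX regcomp/regexec calls from "man regex". The helper name posix_matches is made up for illustration. It assumes the program runs in the default "C" locale, where the implementations I know of match byte by byte; behavior under other locales varies by platform.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches the POSIX extended
 * regex in pattern, 0 if it does not, -1 if the pattern won't compile. */
int posix_matches(const char *pattern, const char *text)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under that assumption, the single character "é" encoded as UTF-8 (bytes C3 A9) does not match "^.$" but does match "^..$", because each "." consumes one byte rather than one character.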


> I do this with a fair amount of regularity. NSString is unsuitable for
> working with data whose encoding is unknown or doubtful, and NSData
> doesn't have any string-like functionality, so the standard C str
> functions can be very useful here.

Ouch. The problem with those is that, every time you call one, you've added a potential buffer overrun bug to your app. And if the data in the string came from an untrusted source like the network, that escalates to a potential security vulnerability.
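
As a rough illustration of what avoiding the overrun takes, here is a sketch of a copy that checks the destination size before writing anything. The name copy_bounded and its return convention are invented for this example; the point is only that the length check has to be written explicitly, every time, which is exactly what is easy to forget.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src_len bytes from src into dst and
 * NUL-terminates. Returns 0 on success, or -1 (leaving dst as an empty
 * string when possible) if the data plus terminator would not fit. */
int copy_bounded(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    if (dst == NULL || dst_size == 0)
        return -1;
    if (src == NULL || src_len >= dst_size) {
        dst[0] = '\0';          /* refuse rather than overrun or truncate */
        return -1;
    }
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return 0;
}
```

A plain strcpy(dst, src) with the same arguments would write past the end of dst whenever src is longer than the buffer, which is the case an attacker controlling network input gets to choose.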


Also, speaking of doubtful encodings, the regular C string functions will fail quite badly on 16-bit character encodings such as UTF-16. For mostly-ASCII text in such an encoding, every other byte is a zero, and the C functions treat the first zero byte as the end of the string.
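
A small sketch of that failure, using the two-character string "Hi" hand-encoded as little-endian UTF-16. The array and function names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* "Hi" as UTF-16LE, followed by a two-byte (16-bit) terminator. */
static const unsigned char utf16le_hi[] = { 'H', 0x00, 'i', 0x00, 0x00, 0x00 };

/* Actual size of the text in bytes, not counting the terminator. */
size_t utf16le_hi_payload_bytes(void)
{
    return sizeof utf16le_hi - 2;
}

/* What strlen reports: it stops at the zero byte that is really the
 * high half of the first 16-bit code unit. */
size_t utf16le_hi_strlen(void)
{
    return strlen((const char *)utf16le_hi);
}
```

The payload is 4 bytes, but strlen reports 1, so anything built on strlen, strcpy, or strcmp sees only "H".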

My general tactic when dealing with data whose encoding can't be determined is to just fall back on CP-1252 (Aki Inoue suggested MacRoman instead). Both are supersets of ASCII that map every byte to a character, so you'll always get a non-nil NSString, and any ASCII text in the original will come out unscathed. That's a better result than you'll get with the C string APIs.
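
To show why such a fallback can never fail, here is a plain-C sketch of the CP-1252 mapping, transcoding to UTF-8. It is not how NSString does it internally, and the function names are made up. One assumption to flag: the five bytes the code page leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) are mapped here to the C1 control code points of the same value, which is one common convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Code points for bytes 0x80..0x9F. The five unassigned bytes map to
 * the C1 control with the same value (an assumption, see above). */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Every byte value yields a code point, so decoding cannot fail. */
uint32_t cp1252_to_codepoint(unsigned char b)
{
    if (b >= 0x80 && b <= 0x9F)
        return cp1252_high[b - 0x80];
    return b;   /* ASCII and 0xA0..0xFF have the same value in Unicode */
}

/* Transcodes src to NUL-terminated UTF-8 in dst. Returns the number of
 * UTF-8 bytes needed, excluding the NUL. If the result is >= dst_size
 * the output was truncated (possibly mid-character) and should be
 * discarded. Pass dst = NULL to only measure. */
size_t cp1252_to_utf8(const unsigned char *src, size_t src_len,
                      char *dst, size_t dst_size)
{
    size_t out = 0;
    for (size_t i = 0; i < src_len; i++) {
        uint32_t cp = cp1252_to_codepoint(src[i]);
        unsigned char enc[3];
        size_t n;
        if (cp < 0x80) {
            enc[0] = (unsigned char)cp;
            n = 1;
        } else if (cp < 0x800) {
            enc[0] = (unsigned char)(0xC0 | (cp >> 6));
            enc[1] = (unsigned char)(0x80 | (cp & 0x3F));
            n = 2;
        } else {
            enc[0] = (unsigned char)(0xE0 | (cp >> 12));
            enc[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            enc[2] = (unsigned char)(0x80 | (cp & 0x3F));
            n = 3;
        }
        for (size_t k = 0; k < n; k++) {
            if (dst != NULL && out + 1 < dst_size)
                dst[out] = (char)enc[k];
            out++;
        }
    }
    if (dst != NULL && dst_size > 0)
        dst[out < dst_size ? out : dst_size - 1] = '\0';
    return out;
}
```

ASCII bytes pass through unchanged, and a byte like 0xE9 comes out as "é" whether or not that was what the sender meant, which is the trade-off of this tactic: you always get text, just not necessarily the right text for the non-ASCII parts.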

—Jens
