Re: RegEx libraries & unicode support

  • Subject: Re: RegEx libraries & unicode support
  • From: Allan Odgaard <email@hidden>
  • Date: Fri, 14 May 2004 15:17:48 +0200

On 14. May 2004, at 8:43, Nicholas Riley wrote:

>> Not to belittle any of the dozen regular expression libraries recently
>> mentioned, but do any of them support unicode? mainly I am thinking [...]
>
> At least AGRegex (PCRE) and OgreKit (OniGuruma) have solid Unicode
> support; I'm not sure of the individual things you mentioned, but
> they're easy enough to download and try out yourself.

This is what I could find at the PCRE page:

    [...] the characters that PCRE recognizes as digits, spaces,
    or word characters remain the same set as before, all with
    values less than 256.

    Case-insensitive matching applies only to characters whose
    values are less than 256

    PCRE does not support the use of Unicode tables and properties
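
To make the second excerpt concrete, here is a toy model in Python 3 of a caseless comparison that only folds characters below 256. It is not PCRE code: the helper names are hypothetical, and using Python's str.lower() for the characters below 256 is an assumption (which of those characters PCRE actually folds depends on its character tables).

```python
def fold_below_256(text):
    """Lower-case only code points below 256; leave everything else untouched.

    Hypothetical helper, written to mimic the documented restriction.
    """
    return "".join(ch.lower() if ord(ch) < 256 else ch for ch in text)


def caseless_equal_below_256(a, b):
    """Compare two strings caselessly, but only for code points below 256."""
    return fold_below_256(a) == fold_below_256(b)


# U+00C9 / U+00E9 (E with acute) are below 256, so they compare equal.
latin1_pair = caseless_equal_below_256("\u00c9", "\u00e9")

# U+03A9 / U+03C9 (Greek omega) are above 255, so the case difference survives.
greek_pair = caseless_equal_below_256("\u03a9", "\u03c9")
```

Under this model a caseless search for a Greek or Cyrillic word would only find it in the exact case it was typed in.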

Also, dot and repeats match single code-points (i.e. a base char or a combining mark, but never both).
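
A small sketch of what that means in practice, using Python 3's re module rather than PCRE (an assumption made for convenience; Python's "." also consumes exactly one code point). The helper names are made up for the illustration.

```python
import re
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT:
# two code points that render as one visible character.
decomposed = "e\u0301"

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)


def dot_matches_whole(text):
    """Return True if a single "." consumes the entire string."""
    return re.fullmatch(".", text) is not None


def count_dot_matches(text):
    """Count how many separate matches "." finds in the string."""
    return len(re.findall(".", text))
```

Here dot_matches_whole(decomposed) is False while dot_matches_whole(composed) is True, and count_dot_matches(decomposed) is 2: the base character and the combining mark are matched separately, so the same visible text behaves differently depending on how it is encoded.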

I believe MOKit is also based on PCRE, so the same limitations would apply there. That is not what I would call strong Unicode support; for comparison, see Unicode Technical Report #18 on regular expressions: http://www.unicode.org/unicode/reports/tr18/

I wasn't able to find a specification of OniGuruma's level of support.