Re: How to get Unicode's "General Category" of a character?


  • Subject: Re: How to get Unicode's "General Category" of a character?
  • From: Dmitry Markman <email@hidden>
  • Date: Tue, 07 Jul 2015 20:48:54 -0400

Hi Gerriet

First of all, u_charType is declared in the unicode/uchar.h header (not utypes.h).

I think the best approach is to download the ICU source distribution from

http://site.icu-project.org/download/55#TOC-ICU4C-Download

and build it yourself.

To build it, do the following:

download and unpack icu4c-55_1-src.tgz, then

cd icu
mkdir build
export CXXFLAGS='--std=c++11 --stdlib=libc++ -DUCHAR_TYPE=char16_t'
cd build
../source/configure --enable-shared --enable-static --prefix=<path_to_install_dir>
make
make install

(for a debug build, add --enable-debug to the configure line)

Then, in include/unicode/platform.h, immediately after the lines

#   if (defined(__cplusplus) && __cplusplus >= 201103L) || (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L)
#       define U_HAVE_CHAR16_T 1

add the following:

#       define UCHAR_TYPE  char16_t


Now try ICU. If you get the error U_MISSING_RESOURCE_ERROR, rebuild the data
from the build/data directory: touch Makefile there and run make again.


Note: I tried to use Homebrew, but I wasn't able to build C++11 libraries that use the char16_t type.

The instructions above will let you do just that.

To build your application, use the following flags:

    LDFLAGS:  -L<path_to_install_dir>/lib
    CPPFLAGS: -I<path_to_install_dir>/include

Hope this helps.

Ask me off-list if you have any problems.

cheers

 dm







> On Jul 7, 2015, at 9:10 AM, Gerriet M. Denkmann <email@hidden> wrote:
>
>
>> On 7 Jul 2015, at 19:33, Dmitry Markman <email@hidden> wrote:
>>
>> ICU’s
>>
>> u_charType
>
> Looks exactly like what I need.
> But: are the headers and the library on my Mac?
>
> There is /usr/lib/libicucore.A.dylib which might contain u_charType, but I cannot find any headers (e.g. utypes.h).
>
> Do I have to download the source from ICU?
>
>
> Kind regards,
>
> Gerriet.
>
>
>
>>
>>
>>> On Jul 7, 2015, at 8:03 AM, Gerriet M. Denkmann <email@hidden> wrote:
>>>
>>> Given a character (a Unicode code point, to be exact) like U+FF0B (FULLWIDTH PLUS SIGN), I want to know the General Category of this.
>>> For this example it would be "Sm" (aka Math_Symbol or Symbol, Math).
>>>
>>> I could download the current version of UnicodeData.txt and parse it.
>>> But this does not look very efficient.
>>>
>>> For punctuation one could use NSCharacterSet punctuationCharacterSet.
>>>
>>> But for Math Symbols?
>>>
>>> I did look at CFStringTransform, which can give the Character name via kCFStringTransformToUnicodeName.
>>>
>>> But I cannot find anything for "General Category".
>>>
>>> NSRegularExpression can match for [\p{General_Category = Math_Symbol}]; not quite what I want, but better than nothing.
>>>
>>>
>>> Any ideas?
>>>
>>> Gerriet.
>>>
>>>
>>> _______________________________________________
>>>
>>> Cocoa-dev mailing list (email@hidden)
>>>
>>> Please do not post admin requests or moderator comments to the list.
>>> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>>>
>>> Help/Unsubscribe/Update your Subscription:
>>>
>>> This email sent to email@hidden
>>
>> Dmitry Markman
>>
>

Dmitry Markman



