Re: Unicode canonical decomposed form and text encoding
- Subject: Re: Unicode canonical decomposed form and text encoding
- From: Renaud Boisjoly <email@hidden>
- Date: Tue, 14 Jan 2003 14:45:05 -0500
Thanks Aki!
I'll give this a try and let everyone know if it works. Thanks a
million!
Renaud
On Tuesday, January 14, 2003, at 02:39 PM, Aki Inoue wrote:
Renaud,
I cooked up a simple example of using TEC to canonically decompose.
#import <Foundation/Foundation.h>
static UniChar characters[] = {0x00C0}; // LATIN CAPITAL LETTER A WITH GRAVE
#define MAX_BUFFER_LENGTH (100)
int main (int argc, const char * argv[]) {
NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
UnicodeToTextInfo textInfo;
UnicodeMapping mapping =
{CreateTextEncoding(kTextEncodingUnicodeDefault,
kTextEncodingDefaultVariant, kUnicode16BitFormat),
CreateTextEncoding(kTextEncodingUnicodeDefault,
kUnicodeCanonicalDecompVariant, kUnicode16BitFormat),
kUnicodeUseLatestMapping};
UniChar buffer[MAX_BUFFER_LENGTH];
ByteCount inputRead, outputLen;
OSStatus status;
status = CreateUnicodeToTextInfo(&mapping, &textInfo);
if (noErr != status) {
NSLog(@"Failed to create UnicodeToTextInfo");
exit(1);
}
status = ConvertFromUnicodeToText(textInfo, sizeof(characters),
characters, kTECKeepInfoFixMask, 0, NULL, NULL, NULL,
MAX_BUFFER_LENGTH * sizeof(UniChar), &inputRead, &outputLen, buffer);
if (noErr != status) {
NSLog(@"Failed to convert string");
exit(1);
}
DisposeUnicodeToTextInfo(&textInfo);
[pool release];
return 0;
}
I tested this on Jaguar, but it is supposed to work on earlier versions too.
Pre-Jaguar TEC doesn't have precomposition capability.
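For anyone checking the result: after a successful conversion of U+00C0, the output buffer should hold U+0041 (LATIN CAPITAL LETTER A) followed by U+0300 (COMBINING GRAVE ACCENT). The plain C sketch below shows the same mapping without TEC. It is only an illustration: the type name UniChar16, the three-entry table and the decompose() function are made up for this example, and real canonical decomposition uses the full Unicode data, applies mappings recursively and reorders combining marks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t UniChar16; /* stand-in for Carbon's UniChar (assumption) */

/* Hypothetical three-entry table; real data comes from the Unicode
   Character Database, not from a hand-written list like this. */
struct Decomp { UniChar16 composed, base, mark; };
static const struct Decomp kDecompTable[] = {
    {0x00C0, 0x0041, 0x0300}, /* A WITH GRAVE -> A + COMBINING GRAVE ACCENT */
    {0x00C9, 0x0045, 0x0301}, /* E WITH ACUTE -> E + COMBINING ACUTE ACCENT */
    {0x00F1, 0x006E, 0x0303}, /* n WITH TILDE -> n + COMBINING TILDE */
};

/* Copies in[0..inLen) to out, expanding any character found in the table
   into base + combining mark. Returns the number of 16-bit units written,
   or (size_t)-1 if out is too small. */
size_t decompose(const UniChar16 *in, size_t inLen,
                 UniChar16 *out, size_t outCap)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < inLen; i++) {
        const struct Decomp *hit = NULL;
        for (size_t k = 0; k < sizeof kDecompTable / sizeof kDecompTable[0]; k++) {
            if (kDecompTable[k].composed == in[i]) {
                hit = &kDecompTable[k];
                break;
            }
        }
        size_t need = hit ? 2 : 1;
        if (n + need > outCap)
            return (size_t)-1;
        if (hit) {
            out[n++] = hit->base;
            out[n++] = hit->mark;
        } else {
            out[n++] = in[i]; /* not in the table: pass through unchanged */
        }
    }
    return n;
}
```

Note that the count returned here is in 16-bit units, whereas TEC reports outputLen in bytes, so the equivalent TEC result for a single U+00C0 would be 4 bytes.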
Note I'm passing kTECKeepInfoFixMask to ConvertFromUnicodeToText()
above.
This bit keeps the last conversion info so that TEC ensures the
repeated conversion doesn't cause surprises with decomposed/surrogate
characters in the source buffer.
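To make the hazard concrete: if a long string is fed to the converter in pieces, a piece can end between a high and a low surrogate, or between a base character and its combining mark. The sketch below is not what TEC does internally; it is a hypothetical helper (safe_split() and the is_* functions are invented names) that a caller could use to choose chunk boundaries by hand. It assumes UTF-16 input and, for brevity, treats only U+0300 through U+036F as combining marks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }
/* Only the Combining Diacritical Marks block; an assumption for brevity. */
static int is_combining_mark(uint16_t u) { return u >= 0x0300 && u <= 0x036F; }

/* Returns the largest index <= want at which text can be cut without
   separating a surrogate pair or detaching a combining mark from the
   character before it. A result of 0 means no safe cut exists at or
   before want; the caller then has to use a larger chunk. */
size_t safe_split(const uint16_t *text, size_t len, size_t want)
{
    assert(text != NULL);
    size_t split = want < len ? want : len;
    while (split > 0 && split < len) {
        if (is_low_surrogate(text[split]) && is_high_surrogate(text[split - 1])) {
            split--; /* would cut a surrogate pair in half */
            continue;
        }
        if (is_combining_mark(text[split])) {
            split--; /* would leave the mark without its base */
            continue;
        }
        break;
    }
    return split;
}
```

A caller would convert text[0..split), then continue from text + split with the next chunk.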
Hope this helps,
Aki
On 2003.1.13, at 07:54 PM, Renaud Boisjoly wrote:
Hi all!
I'm trying to convert Unicode strings from the precomposed form to
the decomposed form (or the other way around) under 10.1.
Under 10.2 I can use NSString's decomposedStringWithCanonicalMapping
method, which works fine.
But this is not supported under 10.1, which breaks my app.
Has anyone ever done this using Text Encoding Converter or Unicode
Converter? Or perhaps by adapting GPL routines like GNOME's
libunicode? (http://cvs.gnome.org/lxr/source/libunicode/decomp.h and
http://cvs.gnome.org/lxr/source/libunicode/decomp.c)
I've never adapted regular C routines like these to Objective-C and
my experience is quite limited in that area. Same goes with Carbon
routines like TEC.
If anyone is willing to share their experience with this type of
stuff, it would really help me out.
Here's something I got from this list which does encoding
conversions, but I'm not sure how I'm supposed to call this function
from my Cocoa code... I used kTextEncodingMacUnicode instead of the
one in the original code, but I'm not sure if that is the right
choice either...
+ (TECObjectRef) _unicodeID3v2ToMacTextConverter
{
static TECObjectRef id3v2ToMacTextConverter = NULL;
if (id3v2ToMacTextConverter == NULL) {
OSStatus status;
status = TECCreateConverter(&id3v2ToMacTextConverter,
kTextEncodingMacUnicode,
kTextEncodingUnicode);
if (status != 0) {
NSLog(@"TECCreateConverter() error %d", status);
return NULL; // TECObjectRef is not an Objective-C object
}
}
return id3v2ToMacTextConverter;
}
+ (CFStringRef /* implies caller retain */) _convertUnicodeText:
(ConstTextPtr) textBuffer ofLength:
(unsigned int) length
{
ByteCount actualInputConsumed;
ByteCount actualOutputProduced;
UInt8 outputBuffer[1024];
CFMutableStringRef returnString = CFStringCreateMutable(NULL, 0);
TECObjectRef textConverter = [self
_unicodeID3v2ToMacTextConverter];
do {
OSStatus status;
status = TECConvertText(textConverter, textBuffer, length,
&actualInputConsumed, outputBuffer,
sizeof(outputBuffer),
&actualOutputProduced);
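if (status != 0) {
NSLog(@"TECConvertText() error %d", status);
CFRelease(returnString); // don't leak the partial result
return NULL;
}
// TEC reports bytes; CFStringAppendCharacters() expects a UniChar count.
CFStringAppendCharacters(returnString, (const UniChar *)outputBuffer,
actualOutputProduced / sizeof(UniChar));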
length = length - actualInputConsumed;
textBuffer = textBuffer + actualInputConsumed;
} while (length > 0);
return returnString;
}
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.