Invalid strings and related bugs (#5775749)
- Subject: Invalid strings and related bugs (#5775749)
- From: Nir Soffer <email@hidden>
- Date: Sun, 2 Mar 2008 04:44:20 +0200
I found that it is possible to get invalid strings from a PowerPoint
file using AppleScript. Such a string cannot be converted to UTF-8,
and it corrupts the output of NSXMLDocument.
The problem occurs when iterating over the paragraphs in a shape;
iterating over lines returns correct strings. The real issue, however,
is that NSString accepts the invalid data without any error and
returns an invalid instance. You can trim the invalid instance, get a
mutable copy, replace characters, and so on. When you use it in
NSXMLDocument, it silently corrupts the output.
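Since NSString does not reject the data, one option is to check the
string yourself before using it. The report does not say what the junk
actually is, so the following is only a minimal sketch under the
assumption that it contains unpaired UTF-16 surrogates, which is one
plausible reason for a failed UTF-8 conversion. The function name
first_unpaired_surrogate is hypothetical; it is plain C, so it also
compiles inside an Objective-C file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first unpaired surrogate in a UTF-16 buffer,
   or -1 if every surrogate is part of a valid high/low pair.
   Assumption: the "junk" is ill-formed UTF-16; other kinds of invalid
   data are not detected here. */
long first_unpaired_surrogate(const uint16_t *units, size_t length)
{
    assert(units != NULL || length == 0);
    for (size_t i = 0; i < length; i++) {
        uint16_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 < length &&
                units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
                i++;            /* skip the second half of the pair */
                continue;
            }
            return (long)i;
        }
        if (u >= 0xDC00 && u <= 0xDFFF) {
            return (long)i;     /* low surrogate with no high before it */
        }
    }
    return -1;
}
```

In a Cocoa program the buffer can be filled with -[NSString
getCharacters:range:], since unichar is a 16-bit code unit.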
When you try to log such a string with NSLog, it fails silently: the
log line never appears! CFShow does print the string and shows some
junk inside it.
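When the log line goes missing, it can help to look at the raw UTF-16
code units instead of the formatted string. Below is a rough sketch of
a hex dump helper; dump_code_units is a hypothetical name, not part of
any Apple API, and it makes no assumption about what the junk is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes each UTF-16 code unit as four hex digits separated by single
   spaces, e.g. "0048 0069". Output is truncated at a code unit boundary
   if `out` is too small and is NUL-terminated whenever out_size > 0.
   Returns the number of code units actually written. */
size_t dump_code_units(const uint16_t *units, size_t length,
                       char *out, size_t out_size)
{
    assert(out != NULL || out_size == 0);
    size_t pos = 0, written = 0;
    if (out_size > 0) out[0] = '\0';
    for (size_t i = 0; i < length; i++) {
        size_t need = (i == 0) ? 4 : 5;   /* digits plus a leading space */
        if (pos + need + 1 > out_size) break;
        if (i > 0) out[pos++] = ' ';
        snprintf(out + pos, out_size - pos, "%04X", (unsigned)units[i]);
        pos += 4;
        written++;
    }
    return written;
}
```

The buffer can be filled with -[NSString getCharacters:range:] and the
result printed with fprintf(stderr, ...), which does not depend on
NSLog.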
On 10.4.11, the string is truncated by AppleScript. It can be
converted to UTF-8 and logged with NSLog. When used in NSXMLDocument,
it still corrupts the document, but does not truncate it.
I reported the bug (#5775749), but I guess that others would like to
know about this issue.
Here is example code that shows the bug. To reproduce, download the
source and example files from
<http://nirs.freeshell.org/files/invalid-string.tbz>
// Run this in the directory where "Slide Text.scpt" is located.
// Compile: cc invalid-string.m -o invalid-string -framework Cocoa

#import <Cocoa/Cocoa.h>

int main(void) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSDictionary *errorInfo = nil;

    // Get a list of shape text
    NSURL *url = [NSURL fileURLWithPath:@"Slide Text.scpt"];
    NSAppleScript *script = [[[NSAppleScript alloc]
        initWithContentsOfURL:url
                        error:&errorInfo] autorelease];
    if (script == nil) {
        NSLog(@"Cannot load script: %@", errorInfo);
        exit(1);
    }

    NSAppleEventDescriptor *result = [script
        executeAndReturnError:&errorInfo];
    if (result == nil) {
        NSLog(@"Script error: %@", errorInfo);
        exit(1);
    }

    NSString *slideText = [result stringValue];

    //// Bugs:

    // 1. The string contains junk - probably a PowerPoint bug - but
    //    NSString should return nil or truncate the invalid data
    CFShow(slideText);

    // 2. NSLog fails silently when printing this string - the log
    //    line is simply missing!
    NSLog(@"slide text: %@", slideText);

    // 3. The string cannot be converted to UTF-8 (returns NULL).
    //    Guard the NULL so printf is never handed a null pointer.
    const char *utf8 = [slideText UTF8String];
    printf("utf8 string: %s\n", utf8 ? utf8 : "(null)");

    // 4. The XML data is corrupted without any error: the element
    //    containing the invalid string is missing, part of the string
    //    appears, and the document is truncated after the invalid string.
    NSXMLDocument *doc = [NSXMLDocument document];
    [doc setCharacterEncoding:@"UTF-8"];
    NSXMLElement *root = [NSXMLElement elementWithName:@"doc"];
    [doc setRootElement:root];
    NSXMLElement *slide = [NSXMLElement elementWithName:@"slide"];
    [root addChild:slide];
    [slide setStringValue:slideText];
    NSData *data = [doc XMLDataWithOptions:NSXMLNodePrettyPrint];
    NSString *xmlString = [[[NSString alloc] initWithData:data
        encoding:NSUTF8StringEncoding] autorelease];
    printf("%s\n", [xmlString UTF8String]);

    [pool release];
    return 0;
}
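A possible workaround, which is not part of the bug report itself, is
to scrub the string before it reaches NSXMLDocument. The sketch below
assumes the junk consists of code units that XML 1.0 does not allow
(control characters other than tab, LF and CR, U+FFFE, U+FFFF) or of
unpaired surrogates, and replaces each of them with U+FFFD. Whether
this actually covers the PowerPoint data is untested; sanitize_for_xml
is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int is_high(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Replaces, in place, every UTF-16 code unit that is an unpaired
   surrogate or falls outside the XML 1.0 Char production with U+FFFD.
   Returns the number of code units replaced. */
size_t sanitize_for_xml(uint16_t *units, size_t length)
{
    assert(units != NULL || length == 0);
    size_t replaced = 0;
    for (size_t i = 0; i < length; i++) {
        uint16_t u = units[i];
        int bad;
        if (is_high(u) && i + 1 < length && is_low(units[i + 1])) {
            i++;                /* valid pair: U+10000..U+10FFFF is legal */
            continue;
        }
        if (is_high(u) || is_low(u)) {
            bad = 1;            /* unpaired surrogate */
        } else if (u < 0x20) {
            bad = !(u == 0x09 || u == 0x0A || u == 0x0D);
        } else {
            bad = (u == 0xFFFE || u == 0xFFFF);
        }
        if (bad) {
            units[i] = 0xFFFD;  /* REPLACEMENT CHARACTER */
            replaced++;
        }
    }
    return replaced;
}
```

In Cocoa, the code units could be copied out with -[NSString
getCharacters:range:], scrubbed, and turned back into a string with
+[NSString stringWithCharacters:length:] before calling
setStringValue:.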
Best Regards,
Nir Soffer
_______________________________________________
Cocoa-dev mailing list (email@hidden)