am i loading this pdf data correctly or not?

  • Subject: am i loading this pdf data correctly or not?
  • From: Ben Dougall <email@hidden>
  • Date: Wed, 6 Aug 2003 00:04:59 +0100

i'm trying to parse the contents of a pdf file using a regex framework called AGRegex. it works fine until unicode-type characters appear; from that point on it fails to find any of the expected matches. so the regex stops dead as soon as some unicode characters appear in the pdf data (in the file itself that data is binary, not the unicode representations shown below).

the regex framework is supposed to work fine with unicode: it's based on pcre 4.0, which is unicode compliant if built correctly. i think i've either built the regex framework incorrectly, without unicode support, or i'm incorrectly creating or setting up the string that gets passed to the regex methods. here's the code that sets up the string from the pdf:


NSArray *fileTypes = [NSArray arrayWithObject:@"pdf"];
NSOpenPanel *openPanel = [NSOpenPanel openPanel];
if ([openPanel runModalForDirectory:NSHomeDirectory() file:nil types:fileTypes] == NSOKButton) {
    // load the first selected pdf file into an NSString
    pdfData = [[NSString alloc] initWithContentsOfFile:[[openPanel filenames] objectAtIndex:0]];
}


i print out the pdf data using NSLog(@"%@", pdfData); and while it looks like this...:

<<
/Type /Font
/Subtype /Type1
/Name /F0
/BaseFont /Times-Roman
/Encoding /MacRomanEncoding
>>
endobj
15 0 obj

...everything's fine (the regex gets the expected matches). as soon as binary data occurs, which looks like this after it's been through my code above...:

x\u2044\u2260W\u20acr\u2030\u2202\021\u02dd\307\u02d8\007<%\372\u2018\016\345;\001\u03a98Vv\u25ca^{\371\u220f\250\2518UR\036\256!\247\260\u2248!\253C\351\02

...it goes wrong (fails to get matches thereafter, even once the binary/unicode data stops and returns to resembling the first snippet).

just to make clear: the second snippet above was binary data in the file, and has been converted to that unicode style by my code that sets up the string.
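
for illustration, here is a rough sketch in plain C (which also compiles as objective-c) of what that conversion seems to do to the data. it rests on an assumption that is not confirmed anywhere in this post: that the bytes were decoded as MacRoman, which is what escapes such as \u2044 and \u2260 would correspond to. the table covers only five high bytes, and every name in it (kSampleTable, sample_macroman_to_codepoint, utf8_length, utf8_length_after_decoding) is hypothetical, not part of cocoa or AGRegex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One raw byte and the Unicode code point MacRoman assigns to it. */
typedef struct {
    uint8_t  byte;
    uint32_t codepoint;
} SampleEntry;

/* ASSUMPTION: the file was decoded as MacRoman. Partial table only:
   five high bytes whose code points show up in the logged snippet. */
static const SampleEntry kSampleTable[] = {
    {0xAD, 0x2260}, /* NOT EQUAL TO */
    {0xB6, 0x2202}, /* PARTIAL DIFFERENTIAL */
    {0xDA, 0x2044}, /* FRACTION SLASH */
    {0xDB, 0x20AC}, /* EURO SIGN */
    {0xE4, 0x2030}, /* PER MILLE SIGN */
};

/* ASCII maps to itself; high bytes missing from the partial table
   come back as U+FFFD (replacement character). */
uint32_t sample_macroman_to_codepoint(uint8_t b)
{
    size_t i;
    if (b < 0x80)
        return b;
    for (i = 0; i < sizeof kSampleTable / sizeof kSampleTable[0]; i++) {
        if (kSampleTable[i].byte == b)
            return kSampleTable[i].codepoint;
    }
    return 0xFFFD;
}

/* Number of bytes needed to encode one code point as UTF-8. */
size_t utf8_length(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Total UTF-8 size of `len` raw bytes after they have been decoded
   as (sample) MacRoman text. */
size_t utf8_length_after_decoding(const uint8_t *raw, size_t len)
{
    size_t i, total = 0;
    assert(raw != NULL || len == 0);
    for (i = 0; i < len; i++)
        total += utf8_length(sample_macroman_to_codepoint(raw[i]));
    return total;
}
```

under that assumption, the six raw bytes 0x78 0xDA 0xAD 0x57 0xDB 0x72 would decode to the start of the second snippet (x\u2044\u2260W\u20acr), and utf8_length_after_decoding would report 12 bytes for them if the string were re-encoded as UTF-8. so the text held in the string would no longer correspond byte for byte to what is in the file.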

so have i set up the pdfData NSString incorrectly?
_______________________________________________
cocoa-dev mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/cocoa-dev
Do not post admin requests to the list. They will be ignored.
