Re: am i loading this pdf data correctly or not?
- Subject: Re: am i loading this pdf data correctly or not?
- From: Ben Dougall <email@hidden>
- Date: Thu, 7 Aug 2003 13:24:34 +0100
On Thursday, August 7, 2003, at 12:39 pm, Marcel Weiher wrote:
>> i realised that the streams in their raw form were not useable as
>> they were, but i didn't realise they would cause outright problems.
>> other than the streams, pdfs are ascii i think, or maybe an 8bit char
>> set. i was hoping / expecting the streams to be just ignored.
>
> Since the streams are random binary junk, you can't ignore them by
> parsing through them. After all, it is perfectly permissible for them
> to contain the character sequence "endstream".
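
To make the point about "endstream" concrete, here is a minimal sketch in Python (not Cocoa, and not code from this thread). It assumes a hand-built, hypothetical PDF fragment whose stream dictionary carries a direct integer /Length; in real files /Length may instead be an indirect reference, which this sketch does not handle. The function names are made up for illustration.

```python
import re

# Hypothetical stream payload that happens to contain the bytes "endstream".
payload = b"\x00\x01binary endstream junk\xff\xfe"

# Hand-built fragment of a PDF object wrapping that payload.
pdf_fragment = (
    b"4 0 obj\n<< /Length " + str(len(payload)).encode("ascii") + b" >>\n"
    b"stream\n" + payload + b"\nendstream\nendobj\n"
)

def naive_stream(data):
    """Scan for the first 'endstream' after 'stream'; fooled by the payload."""
    start = data.index(b"stream\n") + len(b"stream\n")
    end = data.index(b"endstream", start)
    return data[start:end]

def length_based_stream(data):
    """Use the /Length entry to skip over the binary data without scanning it.

    Simplification: assumes /Length is a direct integer, not '/Length 5 0 R'.
    """
    m = re.search(rb"/Length\s+(\d+)", data)
    length = int(m.group(1))
    start = data.index(b"stream\n", m.end()) + len(b"stream\n")
    return data[start:start + length]

print(naive_stream(pdf_fragment) == payload)         # False: cut short
print(length_based_stream(pdf_fragment) == payload)  # True
```

The naive scan stops at the "endstream" inside the payload, while the length-based version steps over the binary data without looking at it.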
at the moment i'm not in any way attempting to parse it with any pdf
semantics in mind (the data could have 100 'endstream's in it and it
wouldn't make any difference, because i'm not looking out for endstream
yet at all; i'm just starting this, so these are initial steps). the
data between stream and endstream contains something that messes up the
regular expression's operation (it doesn't just mess up my matching
pattern; the whole operation stops). maybe there's a bug in the regex
i'm using? obviously regex doesn't care about pdf semantics, but
something in this particular stream of data is causing the regex to
break / stop. i think there may well be a bug in the regex i'm using.
i've described this to the person who wrote the regex cocoa wrapper that
i'm using; they were perplexed by the regex stopping in the data part
and asked me to send the code and the file i'm parsing, which i did
yesterday, so i'm waiting for the outcome of that.
seeing as my code did get all the pdf data into an NSString (maybe
incorrectly, as the data between stream and endstream looked like ...
\\001\\u03a98Vv\\u25ca^{\\371\\u220f\\2... after import, which is very
different to how it looks in the original pdf data), the regex shouldn't
be stopped by data like that, i don't think. it may be incorrect
data, but that shouldn't make a jot of difference to the regex operation
/ implementation itself; it should carry on through / past that.
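
As a rough illustration of that expectation, here is a small Python sketch (not the Cocoa regex wrapper discussed here, whose behaviour is unknown). It assumes a made-up byte string with binary junk between stream and endstream, and shows a byte-oriented regex matching on both sides of the junk. It also shows that decoding as Latin-1 maps every byte to exactly one character, so the data survives a round trip through a string. Whether NSString or the wrapper behaves the same way is not something this sketch can confirm.

```python
import re

# Hypothetical data: two object headers with binary junk in between.
raw = (
    b"1 0 obj\nstream\n\x00\x01\xa98Vv\xff^{\xf9\nendstream\nendobj\n"
    b"2 0 obj\nendobj\n"
)

# Latin-1 assigns one character to each byte value 0..255, so the
# decode/encode round trip is lossless even for the binary part.
text = raw.decode("latin-1")
round_trips = text.encode("latin-1") == raw

# A bytes pattern searches the raw data directly and carries on past
# the NUL byte and the other non-ASCII bytes.
object_headers = [m.group(0) for m in re.finditer(rb"\d+ \d+ obj", raw)]

print(round_trips)      # True
print(object_headers)   # [b'1 0 obj', b'2 0 obj']
```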
thanks.
_______________________________________________
cocoa-dev mailing list | email@hidden