Re: Extracting text objects from PDF page

On 6 Jul 2008, at 16:39, David Duncan wrote:

> It looks like you want to write a CGPDFDictionaryApplierFunction and call CGPDFDictionaryApplyFunction with it. This will iterate the dictionary calling the applier function for each key & object in it.
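
For anyone following the thread, the applier pattern described above can be sketched in portable C. This is only a toy model under stated assumptions: ToyEntry, toy_dictionary_apply, StreamInfo and collect_stream_info are hypothetical names, and the values are plain C strings rather than CGPDFObjectRef. With Quartz, the applier instead has the shape void applier(const char *key, CGPDFObjectRef value, void *info), and it is handed to CGPDFDictionaryApplyFunction together with the dictionary and an info pointer.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins (hypothetical, not Quartz types):
   a dictionary is just an array of key/value pairs. */
typedef struct {
    const char *key;
    const char *value;
} ToyEntry;

typedef void (*ToyApplierFunction)(const char *key, const char *value, void *info);

/* Calls the applier once per entry and passes the caller's info pointer through. */
static void toy_dictionary_apply(const ToyEntry *entries, size_t count,
                                 ToyApplierFunction applier, void *info)
{
    for (size_t i = 0; i < count; i++) {
        applier(entries[i].key, entries[i].value, info);
    }
}

/* Caller-defined state that the applier fills in. */
typedef struct {
    const char *length;
    const char *filter;
    size_t visited;
} StreamInfo;

/* Applier: counts every entry and records the two we care about. */
static void collect_stream_info(const char *key, const char *value, void *info)
{
    StreamInfo *out = (StreamInfo *)info;
    out->visited++;
    if (strcmp(key, "Length") == 0) {
        out->length = value;
    } else if (strcmp(key, "Filter") == 0) {
        out->filter = value;
    }
}
```

Calling toy_dictionary_apply(entries, count, collect_stream_info, &streamInfo) with a zero-initialised StreamInfo leaves the Length and Filter values in the struct, which is the same way the info pointer carries results back out of a Quartz applier.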

Thanks David, that gets me the stream's length and filter name. What I was looking for, though, is the string of characters that forms the operand of the Tj operator, and I just realised how to extract it from the scanner. It is really simple, now that I am beginning to get an inkling of how it works:

void operator_Tj(CGPDFScannerRef scanner, void *info)
{
    CGPDFStringRef pdfString;

    // The Tj operator takes a string. Pop the string off the stack.
    bool success = CGPDFScannerPopString(scanner, &pdfString);

    // Convert the PDF string to an NSString and pass it back up.
    // CGPDFStringCopyTextString follows the Copy rule, so the receiver owns the result.
    if (success) {
        ((ScannedPageData *)info)->string = (NSString *)CGPDFStringCopyTextString(pdfString);
    } else {
        ((ScannedPageData *)info)->string = nil;
    }
}
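
To make the stack model behind that callback concrete, here is a small self-contained sketch in portable C. It is a toy, not Quartz: ToyScanner, toy_scanner_pop_string, toy_scan, PageText and toy_operator_Tj are hypothetical names, and the parsing ignores escapes, nested parentheses and every operator other than Tj. It only shows the shape of the mechanism: a string operand is pushed first, then the operator fires a callback that pops it. In Quartz the real callback is registered under the name "Tj" with CGPDFOperatorTableSetCallback, and the scanner is created with CGPDFScannerCreate and run with CGPDFScannerScan.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy scanner: holds at most one pending string operand. */
typedef struct {
    char operand[256];
    int has_operand;
} ToyScanner;

typedef void (*ToyOperatorCallback)(ToyScanner *scanner, void *info);

/* Pops the pending string operand. Returns 1 on success, 0 if nothing is pending. */
static int toy_scanner_pop_string(ToyScanner *scanner, const char **out)
{
    if (!scanner->has_operand) return 0;
    scanner->has_operand = 0;
    *out = scanner->operand;
    return 1;
}

/* Walks the stream: "(...)" pushes a string operand, "Tj" fires the callback. */
static void toy_scan(const char *stream, ToyOperatorCallback on_Tj, void *info)
{
    ToyScanner scanner = { {0}, 0 };
    const char *p = stream;
    while (*p) {
        if (*p == '(') {
            size_t n = 0;
            p++;
            /* Copy up to the closing parenthesis, truncating overlong strings. */
            while (*p && *p != ')') {
                if (n < sizeof scanner.operand - 1) scanner.operand[n++] = *p;
                p++;
            }
            scanner.operand[n] = '\0';
            scanner.has_operand = 1;
            if (*p == ')') p++;
        } else if (p[0] == 'T' && p[1] == 'j') {
            on_Tj(&scanner, info);
            p += 2;
        } else {
            p++;
        }
    }
}

/* Caller-defined state, playing the role of the info pointer's target. */
typedef struct {
    char text[512];
} PageText;

/* Tj callback: pops the operand and appends it to the collected text. */
static void toy_operator_Tj(ToyScanner *scanner, void *info)
{
    const char *s;
    PageText *page = (PageText *)info;
    if (toy_scanner_pop_string(scanner, &s)) {
        strncat(page->text, s, sizeof page->text - strlen(page->text) - 1);
    }
}
```

Scanning the fragment "BT /F1 12 Tf (Hello, ) Tj (world) Tj ET" with an empty PageText collects both operands in order. The toy callback appends rather than overwrites because a page usually contains more than one Tj operator; a callback that assigns to a single field keeps only the last string it sees.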


And could you keep your heart in wonder
at the daily miracles of your life,
your pain would not seem less wondrous
than your joy.

--Kahlil Gibran


 >Extracting text objects from PDF page (From: Antonio Nunes <email@hidden>)
 >Re: Extracting text objects from PDF page (From: David Duncan <email@hidden>)
