Re: Extracting text objects from PDF page



On 6 Jul 2008, at 16:39, David Duncan wrote:

> It looks like you want to write a CGPDFDictionaryApplierFunction and call CGPDFDictionaryApplyFunction with it. This will iterate the dictionary calling the applier function for each key & object in it.

Thanks David, that gets me the stream's length and filter name. What I was looking for, though, is the string of characters that forms the operand of the Tj operator, and I just realised how to extract it from the scanner. It is really simple, now that I am beginning to get an inkling of how it works:


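void operator_Tj(CGPDFScannerRef scanner, void *info)
{
    ScannedPageData *data = (ScannedPageData *)info;
    CGPDFStringRef pdfString = NULL;

    // The Tj operator takes a single string operand. Pop it off the scanner's stack.
    bool success = CGPDFScannerPopString(scanner, &pdfString);

    // Convert the PDF string to an NSString and pass it back up.
    // CGPDFStringCopyTextString follows the Copy rule, so whoever reads
    // data->string owns it and must release it (manual reference counting).
    // Each Tj on the page overwrites the previous value.
    if (success) {
        data->string = (NSString *)CGPDFStringCopyTextString(pdfString);
    } else {
        data->string = nil;
    }
}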
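
For anyone else trying to get the same inkling: the callback above works because operands are pushed onto a stack as the content stream is read, and the operator's callback then pops what it needs. The sketch below is a toy, framework-free model of that idea in plain C. It is not CoreGraphics and does not call any CG function; ToyScanner, toy_pop_string, toy_scan and ToyPageData are hypothetical names made up for illustration. It only understands simple "(string)" operands with no escapes or nesting, and it concatenates every Tj string rather than keeping just the last one.

```c
#include <stdio.h>
#include <string.h>

#define MAX_OPERANDS 16
#define MAX_LEN 64

/* Toy stand-in for a scanner: nothing more than a stack of string operands. */
typedef struct {
    char operands[MAX_OPERANDS][MAX_LEN];
    int count;
} ToyScanner;

/* Same shape as an operator callback: scanner plus caller-supplied info. */
typedef void (*ToyCallback)(ToyScanner *scanner, void *info);

/* Hypothetical "info" payload that collects the text seen on a page. */
typedef struct {
    char text[256];
} ToyPageData;

/* Pop the most recently pushed string; returns 0 if the stack is empty. */
static int toy_pop_string(ToyScanner *s, const char **out)
{
    if (s->count == 0)
        return 0;
    *out = s->operands[--s->count];
    return 1;
}

/* Toy counterpart of operator_Tj: pop the operand and append it to info. */
static void toy_operator_Tj(ToyScanner *scanner, void *info)
{
    const char *s;
    if (toy_pop_string(scanner, &s)) {
        ToyPageData *data = info;
        strncat(data->text, s, sizeof data->text - strlen(data->text) - 1);
    }
}

/* Walk the stream: "(...)" tokens are pushed as operands, any other token
 * is an operator. If it matches op_name the callback runs; either way the
 * operand stack is cleared afterwards (a simplification of this toy). */
static void toy_scan(const char *stream, const char *op_name,
                     ToyCallback cb, void *info)
{
    ToyScanner scanner = { .count = 0 };
    const char *p = stream;

    while (*p) {
        if (*p == ' ' || *p == '\n') {
            p++;
        } else if (*p == '(') {
            const char *end = strchr(p, ')');
            if (!end)
                break;                      /* unterminated string: stop */
            size_t len = (size_t)(end - p - 1);
            if (scanner.count < MAX_OPERANDS && len < MAX_LEN) {
                memcpy(scanner.operands[scanner.count], p + 1, len);
                scanner.operands[scanner.count][len] = '\0';
                scanner.count++;
            }
            p = end + 1;
        } else {
            size_t len = strcspn(p, " \n");
            if (strlen(op_name) == len && strncmp(p, op_name, len) == 0)
                cb(&scanner, info);
            scanner.count = 0;
            p += len;
        }
    }
}
```

Calling toy_scan("BT (Hello, ) Tj (world) Tj ET", "Tj", toy_operator_Tj, &data) on a zero-initialised ToyPageData leaves "Hello, world" in data.text. A Tj with nothing on the stack simply fails the pop, which mirrors the success check in the real callback.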


António

-----------------------------------------------------------
And could you keep your heart in wonder
at the daily miracles of your life,
your pain would not seem less wondrous
than your joy.

--Kahlil Gibran
-----------------------------------------------------------





References: 
 >Extracting text objects from PDF page (From: Antonio Nunes <email@hidden>)
 >Re: Extracting text objects from PDF page (From: David Duncan <email@hidden>)


