On Jun 2, 2005, at 10:25 AM, Dr. Smoke wrote:
Hi, Vince. Thanks for your response!
On 2005.06.02, at 10:49 , Vince DeMarco wrote:
On Jun 2, 2005, at 7:12 AM, Dr. Smoke wrote:
Thanks. I keep hoping someone from Apple's Spotlight team will respond to this thread and clear up the remaining questions:
1. Exactly how is Spotlight using SearchKit, both for content indexing and in search.
Spotlight uses SearchKit, but only for content indexing. Beyond that, does it really matter?
I'd like to know the details. I think others would as well. Could be helpful in troubleshooting. Could also offer some insights useful to developers. Obviously, I'm not looking for "trade secrets" here, but rather a somewhat more detailed understanding of the processes.
I can also provide you with examples of where the indices get out of sync with the file system, and where rebuilding the indices twice in a row on a data-only, non-boot volume, returns ContentIndex.db files with two different sizes -- a 1MB difference -- from each build despite no underlying changes in the files being indexed. If you want to contact me offline for details, I'll be happy to provide them.
Some of this should be expected. The content index file gets compacted after some number of items have been inserted into it, so the file will grow larger, then settle down and be correct.
2. How kMDItemContent is populated if one has files whose UTIs match the types supported by the existing mdimporter objects.
kMDItemContent? What key is this? I assume you mean kMDItemContentType. Basically, Launch Services is asked for the UTI of the file in question.
Forgive my typo. I meant kMDItemTextContent.
Having now had a chance to read the recently-updated "Search Kit Reference" I think I may have a better idea of what is happening under the covers.
However, it might help to have some examples in the "Spotlight Importer Programming Guide" showing how one populates kMDItemTextContent in GetMetadataForFile.
Simply add the key kMDItemTextContent to the dictionary with a CFString (or NSString) containing the contents of your document as text.
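For readers following along, the step just described might look roughly like the sketch below. It is not taken from Apple's sample importer. It assumes the document is plain UTF-8 text, and read_file_text is a hypothetical helper written for this illustration. The CoreFoundation part is guarded so the helper can also be compiled on other platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any Apple API): reads a whole file into
 * a NUL-terminated buffer. Returns NULL on failure; the caller frees it. */
static char *read_file_text(const char *path, size_t *out_len)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buf = NULL;
    size_t len = 0, cap = 0;
    for (;;) {
        /* Keep room for one more 4096-byte chunk plus the terminator. */
        if (len + 4096 + 1 > cap) {
            size_t new_cap = cap ? cap * 2 : 8192;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) { free(buf); fclose(fp); return NULL; }
            buf = tmp;
            cap = new_cap;
        }
        size_t n = fread(buf + len, 1, 4096, fp);
        len += n;
        if (n < 4096)
            break;              /* end of file or read error */
    }
    int failed = ferror(fp);
    fclose(fp);
    if (failed) { free(buf); return NULL; }

    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}

#if defined(__APPLE__)
#include <limits.h>
#include <CoreFoundation/CoreFoundation.h>
#include <CoreServices/CoreServices.h>

/* Importer entry point. Assumption: the file is UTF-8 text; a real
 * importer would parse its own document format here instead. */
Boolean GetMetadataForFile(void *thisInterface,
                           CFMutableDictionaryRef attributes,
                           CFStringRef contentTypeUTI,
                           CFStringRef pathToFile)
{
    (void)thisInterface;
    (void)contentTypeUTI;

    char path[PATH_MAX];
    if (!CFStringGetFileSystemRepresentation(pathToFile, path, sizeof path))
        return false;

    size_t len = 0;
    char *bytes = read_file_text(path, &len);
    if (bytes == NULL)
        return false;

    CFStringRef text = CFStringCreateWithBytes(kCFAllocatorDefault,
                                               (const UInt8 *)bytes,
                                               (CFIndex)len,
                                               kCFStringEncodingUTF8, false);
    free(bytes);
    if (text == NULL)
        return false;           /* bytes were not valid UTF-8 */

    /* This is the step described above: the document's text goes into
     * the attributes dictionary under kMDItemTextContent. */
    CFDictionarySetValue(attributes, kMDItemTextContent, text);
    CFRelease(text);
    return true;
}
#endif
```

Other attributes (title, authors, custom keys) would be added to the same dictionary in the same call.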
I presume if kMDItemTextContent is not set for a document in a custom mdimporter, its text content is not added to ContentIndex.db. True?
Yes, this is true.
Cases where some examples demonstrating the setting of kMDItemTextContent in a custom mdimporter might be helpful:
a. The Spotlight importers can be employed for text extraction. For example, let's say I had an HTML document that contained strings within comments that I wanted to regard as custom metadata. To set these as custom metadata, I'd need to use a custom mdimporter. But Spotlight's bundled importers can also extract the text and other metadata.
We only extract text and other metadata from a certain set of file types. In Launch Services you can inherit from public.plain-text, but our public.plain-text importer will not run on your custom file (i.e., it will not extract metadata or text content); your custom importer needs to do that.
b. One has to provide their own text extraction routine, i.e. the Spotlight importers do not match the type of document from which text content is to be extracted.
The current example shows only two standard metadata items and a custom metadata item, but does not address content, i.e. kMDItemTextContent.
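As an illustration of case (a), a custom importer would need its own routine to pull strings out of HTML comments. The following is only a minimal sketch in plain C: next_html_comment is a hypothetical helper, and it assumes well-formed `<!-- ... -->` comments in a NUL-terminated buffer rather than doing real HTML parsing.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the next HTML comment
 * body at or after *cursor, with surrounding whitespace trimmed, and
 * advances *cursor past it. Returns NULL when no complete comment
 * remains. The caller frees the result. */
static char *next_html_comment(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);

    const char *opener = strstr(*cursor, "<!--");
    if (opener == NULL)
        return NULL;
    const char *start = opener + 4;
    const char *closer = strstr(start, "-->");
    if (closer == NULL)
        return NULL;            /* unterminated comment: give up */

    /* Trim whitespace on both sides of the comment body. */
    const char *end = closer;
    while (start < end && isspace((unsigned char)*start))
        start++;
    while (end > start && isspace((unsigned char)end[-1]))
        end--;

    size_t n = (size_t)(end - start);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, n);
    out[n] = '\0';

    *cursor = closer + 3;       /* continue after the closing "-->" */
    return out;
}
```

In an importer, each returned string could then be converted to a CFString and stored under a custom attribute key of the developer's choosing, while the document's full text still goes under kMDItemTextContent.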
;-) Doc
Vince