Re: Reading in dictionary from txt file: options for speed
Re: Reading in dictionary from txt file: options for speed
- Subject: Re: Reading in dictionary from txt file: options for speed
- From: Greg Guerin <email@hidden>
- Date: Thu, 16 Apr 2009 23:16:41 -0700
Miles wrote:
I'm creating a game for where the dictionary file will never get
modified,
so I'm not really worried about that.
I was pretty sure the dictionary was read-only, but that doesn't mean
it's always error-free. I was actually thinking of production
errors, where the dictionary file accidentally has its final null
lost or removed.
Such things have been known to happen, and from small errors grave
disorder follows unless you code defensively. Sometimes a good
design isn't just defending against user mistakes, it's also
defending against your own mistakes.
Yes, the original suggestion to use strstr().
Not strnstr?
http://lists.apple.com/archives/cocoa-dev/2009/Apr/msg01221.html
It is slow, for a word that's
never found it takes 0.704 seconds on the device. Definitely too long.
So what do you think would be fast enough. For example, if it were
twice as fast, would that suffice? 10 times faster? 100 times?
Remember, it doesn't have to be As Fast As Possible. It only has to
be Fast Enough.
Next, think about the minimum work needed to achieve the desired
speedup. Given a search function and a dictionary, what is the
simplest way to make that search function finish in half the time?
The first thing that comes to mind for me is to cut the dictionary in
half. Boom, the same search function now finishes in half the time.
Or cut the dictionary to 1/10 its original size, and the same search
function finishes in 1/10 the time. Or 1/100. Or whatever.
Now think about multi-volume encyclopedias, the physical book kind.
http://en.wikipedia.org/wiki/File:Brockhaus_Lexikon.jpg
It's impossible to make a single book large enough to hold all the
content, because physical books have physical limits. Instead, the
material is broken into equal-sized sections purely for physical
convenience, and that section is put into a single book or volume.
Within each volume, the content is ordered alphabetically. On the
spine of each volume is a short label idnetifying the range it
covers, such as a single letter of the alphabet, or portions of the
first and last item-names in that volume. If there were only one
label on a volume's spine, it should be the first item in a volume,
because you can just look at the first item of the next volume to
deduce the last possible item in the current volume.
Finally, consider the steps needed to search the entire encyclopedia
for a single given term. The first step is to identify which volume
to use, by looking at the spine labels, then you open only one volume
and search it for the term of interest. You gain enormous speed by
simply eliminating most of the material, without ever having to open
those volumes or even take them off the bookshelf.
http://en.wikipedia.org/wiki/Encyclopedia
NSString *filePath = [[NSBundle mainBundle]
pathForResource: @"dictionary"
ofType: @"txt"];
stringFileContents = [[NSData alloc]
initWithContentsOfMappedFile:filePath];
Does stringFileContents have a null-terminator or not? If not, then
re-read what the arguments to strstr() need to be. Does
[stringFileContents bytes] provide data in the format strstr() needs
to work correctly? Small errors; grave disorder.
-- GG
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden