Re: Reading in dictionary from txt file: options for speed
- Subject: Re: Reading in dictionary from txt file: options for speed
- From: WT <email@hidden>
- Date: Tue, 14 Apr 2009 23:27:30 +0200
Hi Miles,
I wrote a little iPhone app to test loading the standard UNIX
dictionary (/usr/share/dict/web2, 234,936 words). If you'd like, you
can download the Xcode 3.1.x project from here:
http://www.restlessbrain.com/DictTest.zip
I don't actually have an iPhone, so I only tested it on the simulator.
I tried 3 kinds of files:
a) the dictionary stored as a txt file, UTF-8 encoded (2.4 MB).
b) the dictionary stored as an XML plist (6.4 MB).
c) the dictionary stored as a binary plist (3.7 MB).
(I created (b) and (c) from (a) by using a text editor to wrap each
line with the appropriate <string></string> tags (thank you grep!),
then saved the resulting file twice, once as XML and again as binary,
using Property List Editor.app.)
The results (again, on the simulator and on my machine) are shown in
the screenshot inside the project directory (Picture 1.png). There's
little difference between txt (0.27 sec) and bin (0.29 sec), but xml
takes more than twice as long (0.64 sec). Of course, to get a
statistically meaningful result, you should change the code to reload
each file several times and take the average, but what I did is
enough to get an idea of the times involved.
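The averaging suggested above could look roughly like the plain C
sketch below (plain C compiles as-is inside an Objective-C project).
It is not the code from the DictTest project: average_load_seconds,
load_fn and read_whole_file are hypothetical names made up for this
illustration, and read_whole_file only stands in for whichever of the
three loaders is being measured. Note that every run after the first
will probably hit a warm file cache, so the average is likely to be
lower than a cold first load.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical loader signature: returns 0 on success, nonzero on failure. */
typedef int (*load_fn)(const char *path);

/* Current wall-clock time in seconds, via POSIX gettimeofday(). */
static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Stand-in loader: reads the whole file in chunks and discards the bytes. */
int read_whole_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    char chunk[4096];
    while (fread(chunk, 1, sizeof chunk, f) > 0) {
        /* nothing to do: only the read time matters here */
    }
    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : 0;
}

/* Calls `load` on `path` `runs` times and returns the mean time per call
 * in seconds, or -1.0 if the arguments are invalid or any load fails. */
double average_load_seconds(load_fn load, const char *path, int runs)
{
    if (load == NULL || path == NULL || runs <= 0)
        return -1.0;
    double total = 0.0;
    for (int i = 0; i < runs; i++) {
        double start = now_seconds();
        if (load(path) != 0)
            return -1.0;
        total += now_seconds() - start;
    }
    return total / runs;
}
```

Calling average_load_seconds(read_whole_file, path, 10) once per file
would give one averaged number for each of txt, xml and bin.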
I am curious what the times would be on the actual device. If you give
it a try, please let me know.
Hope this helps.
Wagner
On Apr 14, 2009, at 8:12 PM, Miles wrote:
[This is sort of in continuation of the thread "Build Settings for
Release: App/Library is bloated", which gradually changed topics.]
I'm trying to find the best way to load in a 2MB text file of
dictionary words and be able to do quick searches.
Simply loading the uncompressed txt file takes about 0.5 seconds,
which I can handle. But when I used the following to create an array
of the words from the file:
NSArray *lines = [stringFromFileAtPath componentsSeparatedByString:@"\n"];
... it took about 13 seconds, which is way too long.
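For comparison, one alternative to building an array of string
objects is to split the raw bytes in place, so that each word is just
a pointer into the buffer that was read from disk. The plain C sketch
below illustrates that idea only; split_lines_in_place is a
hypothetical helper, and nothing in this thread measures whether it
would actually beat componentsSeparatedByString on the device. It
assumes the caller has read the file into a writable buffer with one
spare byte at the end.

```c
#include <stddef.h>
#include <stdlib.h>

/* Splits `buf` in place on '\n'. `buf` must be writable and hold at
 * least len + 1 bytes (the extra byte receives a terminating NUL).
 * On success, *out_lines is a malloc'd array of pointers into `buf`,
 * one per line, and the line count is returned. The caller frees
 * *out_lines but not the individual lines. Returns 0 and sets
 * *out_lines to NULL on empty input or allocation failure. */
size_t split_lines_in_place(char *buf, size_t len, char ***out_lines)
{
    *out_lines = NULL;
    if (buf == NULL || len == 0)
        return 0;
    buf[len] = '\0';

    /* Pass 1: count lines so the pointer array is allocated once. */
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n')
            count++;
    }
    if (buf[len - 1] != '\n')
        count++; /* last line has no trailing newline */

    char **lines = malloc(count * sizeof *lines);
    if (lines == NULL)
        return 0;

    /* Pass 2: turn each '\n' into a NUL and record where lines start. */
    size_t n = 0;
    char *start = buf;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';
            lines[n++] = start;
            start = buf + i + 1;
        }
    }
    if (start < buf + len)
        lines[n++] = start;

    *out_lines = lines;
    return n;
}
```

The only allocation is the pointer array itself, which for a word
list of this size is a single block of a few hundred thousand
pointers.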
I'm not super concerned about the 2MB of disk space the txt file
takes up, although I wouldn't be mad about decreasing it somehow. And
once I get the whole dictionary in an array, the searches are
basically fast enough for my purposes. I've still been reading up on
Huffman encoding if I decide to try to compress this. However, my
main issue now is loading time, and it seems like this won't help me
there.
And, I'm looking into creating a Trie (which is where the previous
thread guided me), although I'm not sure this helps my current issue
of loading time either. I'm thinking that creating a Trie will
probably take just as long, or longer, than simply splitting the file
using 'componentsSeparatedByString', right? So, is there some way to
store the trie on disk so that the loading of my final data structure
is faster? What other options do I have to speed this up?
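One option that needs no load-time structure at all is to binary
search the raw file contents directly. The plain C sketch below is a
rough illustration under a loud assumption: the word list has been
pre-sorted in plain byte order (the order memcmp uses), which may not
be how the stock word list is sorted as shipped. buffer_contains_word
is a hypothetical name, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `word` appears as a complete line in `buf`, else 0.
 * `buf` holds `len` bytes of '\n'-separated words and is ASSUMED to
 * be sorted in byte order. The buffer is never modified, so it can
 * be read-only (for example, a memory-mapped file). */
int buffer_contains_word(const char *buf, size_t len, const char *word)
{
    size_t wlen = strlen(word);
    size_t lo = 0;   /* always the start of a line */
    size_t hi = len; /* end of the window: a line start or `len` */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        /* Back up from mid to the start of the line that contains it. */
        size_t start = mid;
        while (start > lo && buf[start - 1] != '\n')
            start--;

        /* Scan forward to the end of that line. */
        size_t end = start;
        while (end < hi && buf[end] != '\n')
            end++;

        /* Compare the line with `word`, shorter string sorting first. */
        size_t llen = end - start;
        size_t m = llen < wlen ? llen : wlen;
        int cmp = memcmp(buf + start, word, m);
        if (cmp == 0)
            cmp = (llen > wlen) - (llen < wlen);

        if (cmp == 0)
            return 1;
        if (cmp < 0)
            lo = (end < hi) ? end + 1 : hi; /* continue after this line */
        else
            hi = start;                     /* continue before this line */
    }
    return 0;
}
```

With this approach the only load cost is getting the bytes into
memory, and each lookup touches on the order of log2(n) lines.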
Thanks!
_______________________________________________
Cocoa-dev mailing list (email@hidden)