Re: Best Way To Lookup From a Huge Table
- Subject: Re: Best Way To Lookup From a Huge Table
- From: James Hober <email@hidden>
- Date: Mon, 17 Mar 2008 11:38:38 -0700
To me, "best" large table lookup comes down to a tradeoff:
1) If I can load all the data into memory, using say a hash table,
then the initial load time will be somewhat significant but the
lookups will be near instantaneous.
2) If I can look up the data from an ordered persistent store, the
initial load time can be extremely short but the lookup times are a
little longer because of disk access, yet still pretty fast if the
store is organized and searched efficiently.
I had a situation where I had about 170,000 unique strings that
mapped to 170,000 other strings.
My first implementation used Objective-C++ and a C++ STL map to do
the lookup (solution 1). Depending on the machine, loading the C++
map took roughly 2 to 7 seconds at app launch.
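For illustration only, here is a minimal sketch of what solution 1 can look like with a std::map. This is not my actual code: the tab-separated file format and the names loadTable and lookup are assumptions made up for the example.

```cpp
#include <cassert>
#include <fstream>
#include <map>
#include <string>

using LookupTable = std::map<std::string, std::string>;

// Load every pair from a text file with one "key<TAB>value" pair per line
// (a hypothetical format). This is the up-front cost paid at launch.
LookupTable loadTable(const std::string& path) {
    LookupTable table;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        const auto tab = line.find('\t');
        if (tab == std::string::npos) continue;  // skip malformed lines
        table[line.substr(0, tab)] = line.substr(tab + 1);
    }
    return table;
}

// Return a pointer to the mapped string, or nullptr if the key is absent.
// std::map::find does O(log n) comparisons and never touches the disk.
const std::string* lookup(const LookupTable& table, const std::string& key) {
    assert(!key.empty());  // an empty key is treated as a caller bug in this sketch
    const auto it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}
```

The whole cost sits in loadTable, which has to read and insert every pair before the first lookup can run. After that, each lookup is purely in memory.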
My current implementation uses a Core Data SQLite database (solution
2). I have a separate Foundation Tool that creates the SQLite store,
indexes it and vacuums it. The user app launches very quickly and
the Core Data lookups are on the order of 0.01 seconds or less. Note
that the data rarely if ever changes so I don't have to worry about
the store gradually becoming fragmented and less efficient.
There are also other ways to do solution 2), as at least one other
person mentioned. For example, you can create a custom random access
file with fixed-length records and do your own binary search over
them.
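A rough sketch of the fixed-length record idea, under some assumptions of mine: keys and values are plain ASCII with no trailing spaces, each field is padded with spaces to 16 bytes, and the records are written in ascending key order. The field widths and the names buildStore and findRecord are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <map>
#include <string>

// Hypothetical layout: KEY_LEN bytes of key, then VALUE_LEN bytes of value,
// both space-padded on the right. Records are stored sorted by key.
const std::size_t KEY_LEN = 16;
const std::size_t VALUE_LEN = 16;
const std::size_t RECORD_LEN = KEY_LEN + VALUE_LEN;

// Pad a field with spaces so it occupies exactly `width` bytes.
std::string padField(const std::string& s, std::size_t width) {
    assert(s.size() <= width);  // longer fields do not fit this layout
    return s + std::string(width - s.size(), ' ');
}

// Strip the right-hand space padding added by padField.
std::string trimField(const std::string& s) {
    const auto end = s.find_last_not_of(' ');
    return end == std::string::npos ? std::string() : s.substr(0, end + 1);
}

// Write the store once, offline. std::map iterates in ascending key order,
// which is the order the binary search below relies on.
bool buildStore(const std::string& path, const std::map<std::string, std::string>& pairs) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    for (const auto& kv : pairs) {
        out << padField(kv.first, KEY_LEN) << padField(kv.second, VALUE_LEN);
    }
    out.flush();
    return out.good();
}

// Binary search over the file: each probe seeks to one record and reads it,
// so a lookup costs about log2(record count) small disk reads.
bool findRecord(const std::string& path, const std::string& key, std::string& value) {
    if (key.size() > KEY_LEN) return false;  // cannot be stored in this layout
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return false;
    const std::string target = padField(key, KEY_LEN);
    std::size_t lo = 0;
    std::size_t hi = static_cast<std::size_t>(size) / RECORD_LEN;  // one past the last record
    char buf[RECORD_LEN];
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        in.seekg(static_cast<std::streamoff>(mid * RECORD_LEN), std::ios::beg);
        if (!in.read(buf, RECORD_LEN)) return false;
        const std::string recKey(buf, KEY_LEN);
        if (recKey == target) {
            value = trimField(std::string(buf + KEY_LEN, VALUE_LEN));
            return true;
        }
        if (recKey < target) lo = mid + 1; else hi = mid;
    }
    return false;
}
```

With 170,000 records that is about 18 probes per lookup, and nothing has to be loaded at launch. The price is that the file must be rebuilt whenever the data changes, which is acceptable when the data is essentially static.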
But basically you have to decide whether 1) or 2) is better for your
situation.
James
On 13 Mar 2008, at 21:11, Karan Lyons wrote:
What's the best way to lookup something from a huge table?
I'm trying to write a piece of code that checks weather data given
a zipcode. But I first need to change that zipcode into another
format in order for it to work with the online service I'm querying.
For example:
1) User inputs zipcode 02139.
2) Application looks up zipcode in table and finds that 02139 is
USMA0007
3) Application uses code USMA0007 to query the online service.
The table itself is pretty simple: It's just two columns, one with
every zipcode in the US, and the other with the corresponding
weather code. But there are 41,805 zipcodes in the US, so I'm not
sure of the best way to implement this. The table only needs to be
queried once, after that I can store the weather code in memory.
What's the best way to do this? Thanks for your help!
Namaste,
Karan
Cocoa-dev mailing list (email@hidden)