Re: Display csv in a tableView with bindings
- Subject: Re: Display csv in a tableView with bindings
- From: "I. Savant" <email@hidden>
- Date: Sun, 26 Jul 2009 10:53:29 -0400
On Jul 26, 2009, at 6:32 AM, Aaron Burghardt wrote:
> Neither, you want an array of dictionaries where each row of CSV is
> a dictionary in which the values keyed to column names and each row
> of CSV is one dictionary object in the array.
This is a bit more complicated than that, actually.
There's a bit of a catch-22 here. On the one hand, you have a
performance consideration. On the other, you have an
ease-of-programming consideration. Using NSDictionary is easier, but
for moderately sized files it is noticeably slow, and for large files
it's unusably so.
If you go the dictionary route, using the keys to identify the
"fields" in each row, you're storing *way* more than just the
individual field contents. You're storing a copy of your field
identifier keys for every field, for every row. Best-case scenario,
you're storing a pointer to some object that represents the "column"
to which the fields belong, but this defeats the ease-of-use with
bindings as you need string keys. As I mentioned above, with
increasingly large files, this dramatically increases your reading/
writing time and uses a lot of memory. But at least you get the
ability to easily use bindings and to sort, all for free, performance
be damned.
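For reference, the dictionary route might look roughly like the sketch below (written in Swift; the function name is made up, and splitting and unquoting the raw CSV text is assumed to have happened already). Note that every row ends up with its own key-to-value table, which is where the overhead comes from.

```swift
// Hypothetical helper: turn already-split CSV rows into dictionaries keyed
// by column name. Parsing the raw text is assumed to be done elsewhere.
func dictionaryRows(header: [String], rows: [[String]]) -> [[String: String]] {
    return rows.map { fields in
        // zip stops at the shorter sequence, so surplus fields are dropped.
        // If the header repeats a name, the first field for that name wins.
        Dictionary(zip(header, fields), uniquingKeysWith: { first, _ in first })
    }
}

let header = ["name", "city"]
let table = dictionaryRows(header: header, rows: [["Ann", "Oslo"], ["Bob", "Rome"]])
print(table[1]["city"] ?? "")
```

An array like this can be handed to an NSArrayController and bound by column name, which is the "for free" part.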
If you go another route (an array of arrays of strings), it's far
more efficient, but adds a few programming complexities:
1 - How do you sort by a column? There's no key for sort descriptors
and sorting via selector provides no way to pass additional
information (such as column index or identifier).
2 - To what do you bind? The same limitation that causes concern in
problem #1 makes #2 difficult ... and there is little by way of over-
the-counter laxative to make #2 less difficult.
3 - If you intend to allow reordering of columns (built-in NSTableView
feature) or even adding/removing columns, how do you handle keeping
the columns mapped to the correct fields in the row array in the
absence of an associative array (dictionary)?
The easiest solution to all three of these problems (in my opinion)
is to make a "row" a custom class and add a helper class (we'll call it
"ColumnMapper" - one mapper shared among all rows). The row's internal
storage can still be an array of strings for low overhead, but the Row
class has a trick up its sleeve. It overrides -valueForUndefinedKey:
so that it can still look up associative values (like a dictionary)
but without storing them. The storage occurs once in the ColumnMapper.
When asked for a field value for a column, a Row asks the
ColumnMapper for the index (the index in its storage array) for the
field the column represents. Likewise for storing a field value (via
-setValue:forUndefinedKey:). This works because, since Row doesn't
respond to these column IDs as keys, KVC falls back to
-valueForUndefinedKey:, which our Row class overrides, relying on the
central ColumnMapper to determine where in its internal storage the
value for that column ID is located.
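A minimal sketch of the Row/ColumnMapper idea follows, in Swift and assuming Apple's Foundation (KVC is not available elsewhere). Everything other than the two KVC overrides is illustrative naming of mine, not an established API.

```swift
import Foundation

// Shared by every row: maps a column identifier to an index in row storage.
final class ColumnMapper: NSObject {
    private var indexForColumn: [String: Int] = [:]

    init(columnIDs: [String]) {
        super.init()
        for (index, id) in columnIDs.enumerated() {
            indexForColumn[id] = index
        }
    }

    func index(forColumn id: String) -> Int? {
        return indexForColumn[id]
    }
}

// One CSV row: a flat array of strings plus a reference to the shared mapper.
final class Row: NSObject {
    private var fields: [String]
    private let mapper: ColumnMapper

    init(fields: [String], mapper: ColumnMapper) {
        self.fields = fields
        self.mapper = mapper
        super.init()
    }

    // KVC calls this when the key matches no property or accessor.
    override func value(forUndefinedKey key: String) -> Any? {
        guard let index = mapper.index(forColumn: key), index < fields.count else {
            return super.value(forUndefinedKey: key)  // raises for unknown keys
        }
        return fields[index]
    }

    // The storing counterpart, reached the same way.
    override func setValue(_ value: Any?, forUndefinedKey key: String) {
        guard let index = mapper.index(forColumn: key), index < fields.count else {
            super.setValue(value, forUndefinedKey: key)
            return
        }
        fields[index] = (value as? String) ?? ""
    }
}

let mapper = ColumnMapper(columnIDs: ["name", "city"])
let row = Row(fields: ["Ann", "Oslo"], mapper: mapper)
row.setValue("Bergen", forKey: "city")
let city = row.value(forKey: "city") as? String
```

One caveat with this sketch: a column ID that collides with something the object already answers to (for example "description") never reaches the undefined-key path, so generated IDs are safer than raw header text.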
This solves the sorting issue quite nicely too, if you sort using
descriptors. Since NSSortDescriptor uses KVC, it "just works". Don't
forget to google around for "Finder-like sorting" ... the built-in
methods make a mess of alphanumeric strings. I leave implementing that
to your imagination ... it's actually really easy if you spend a few
minutes with Google.
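As one possible shortcut, on Mac OS X 10.6 and later NSString's -localizedStandardCompare: gives Finder-like ordering and can be used as a sort descriptor's selector. The sketch below (Swift, Apple's Foundation assumed) uses a placeholder key "name" and dictionary rows purely for illustration.

```swift
import Foundation

let names = ["file10", "file2", "file1"]

// Plain ordering compares character by character, so "file10" lands
// before "file2".
let plain = names.sorted()

// localizedStandardCompare treats runs of digits as numbers.
let finderLike = names.sorted {
    $0.localizedStandardCompare($1) == .orderedAscending
}

// The same comparison as a sort descriptor, applied through KVC.
// The key "name" and the dictionary rows are placeholders.
let descriptor = NSSortDescriptor(
    key: "name", ascending: true,
    selector: #selector(NSString.localizedStandardCompare(_:)))
let rows = names.map { ["name": $0] as NSDictionary } as NSArray
let sortedRows = rows.sortedArray(using: [descriptor])
let sortedNames = sortedRows.compactMap { ($0 as? NSDictionary)?["name"] as? String }
```

Because the descriptor goes through KVC, it should work the same way on any row object that answers to the column key.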
Note also this approach requires that all rows have the same number
of columns/fields. Your parsing logic will have to account for this by
either automatically adjusting (fraught with complexities and
assumptions) or rejecting the file and informing the user of the first
row where trouble begins - ie, the first row where the number of
fields/columns differs from the rest. You really should take this route
anyway, since the missing field in a row might be somewhere other than
the end ... so what do you do with the remaining fields in the row?
They are probably in the wrong column and there's no way to know
because of CSV's inherent lack of solid structure.
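The "reject and report" check can be quite small. A sketch in Swift, assuming the rows are already split into fields and that the header row defines the expected count (the function name is hypothetical):

```swift
// Returns the index of the first row whose field count differs from the
// header's, or nil when every row matches.
func firstMalformedRow(header: [String], rows: [[String]]) -> Int? {
    return rows.firstIndex { $0.count != header.count }
}

let header = ["name", "city", "zip"]
let rows = [["Ann", "Oslo", "0150"], ["Bob", "Rome"], ["Cy", "Lima", "15001"]]
if let bad = firstMalformedRow(header: header, rows: rows) {
    // Use 1-based data row numbers (header excluded) when telling the user.
    print("Row \(bad + 1) has \(rows[bad].count) fields, expected \(header.count)")
}
```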
The only remaining problem is bindings. If you want to be able to
handle any CSV file (ie, the "fields" are unknown), I'm afraid there's
no way to use bindings in IB. You'll have to create the table columns
(and bind them) in code once you've parsed your file and determined
the number of columns. In this regard, you might find it just as easy
(if not easier) to eschew Cocoa Bindings altogether and just use the
NSTableDataSource protocol. It gives you more precise control over
what to refresh and when. Trust me, this will come up.
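Creating and binding the columns in code might look something like this sketch (Swift with AppKit). The "col0", "col1", ... identifiers are an assumption of mine: using positions rather than raw header text keeps dots and spaces out of the key path, and the row objects are assumed to answer to those keys.

```swift
import AppKit

// Adds one column per parsed header field and binds its value to the
// array controller. The function name and "colN" keys are hypothetical.
func addColumns(for header: [String], to tableView: NSTableView,
                boundTo controller: NSArrayController) {
    for (index, title) in header.enumerated() {
        let key = "col\(index)"
        let column = NSTableColumn(identifier: NSUserInterfaceItemIdentifier(key))
        column.title = title  // shown in the column header
        column.bind(.value, to: controller,
                    withKeyPath: "arrangedObjects.\(key)", options: nil)
        tableView.addTableColumn(column)
    }
}

let tableView = NSTableView(frame: .zero)
let controller = NSArrayController()
addColumns(for: ["Name", "City"], to: tableView, boundTo: controller)
```

If you go the data source route instead, the same identifiers can be looked up in -tableView:objectValueForTableColumn:row: and no bindings are needed.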
Of course for very large files, both methods will be slow (and
memory-intensive), and the problem becomes far more complex because
then you need to start considering low-level solutions that don't
ignore encodings. The antithesis to this concern is that, as the
complexity and size increase, the likelihood that a human will want to
see it as a table they will manually manipulate decreases (or at
least, the reasonableness of the request does). At that magic tipping
point, it's easy to argue that a GUI editor is no longer feasible and
most of this problem goes away.
Good luck and happy coding! :-)
--
I.S.
_______________________________________________
Cocoa-dev mailing list (email@hidden)