Re: Best tool for large (500MB) text manipulation
- Subject: Re: Best tool for large (500MB) text manipulation
- From: Jan Steinman <email@hidden>
- Date: Sun, 17 Oct 2004 08:33:29 -0700
From: Bruce Robertson <email@hidden>
I have a relative who deals with very large data files, for example
500MB text files. He needs to find, extract, and manipulate data from
them. He has been using BBEdit but is coming up against its limits. The
application happens to be analysis of stock trading data.
What sort of 'limits' is he hitting?
What sort of capabilities of BBEdit is he using?
What sort of CS knowledge does he have? (How much is he willing to
learn? :-)
Is this data long-lived, or transient?
There are TONS of non-GUI data manipulation tools in UNIX. I suspect
that, with half a gig of data, he's simply running into memory
problems. Adding RAM to his system may help in the short term.
But a long-term fix is going to involve NOT loading the entire file
into memory at once, which is what happens in BBEdit.
If he's willing to learn some geek-speak, he should probably look at
standard UNIX tools, like awk, sed, and grep. By learning just a bit
about Regular Expressions, he'll have a MUCH better time of "finding,
extracting, and manipulating data" with those tools. A more ambitious
(and more powerful) approach would be learning Perl.
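
To give a flavor of what "finding and extracting" with a Regular
Expression looks like, here is a minimal sketch in Python (used here
only for illustration; the same pattern idea carries over to grep, awk,
or Perl). The record layout "date,SYMBOL,price,shares" and the name
extract_trades are assumptions made up for the example, since the
actual file format isn't known.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed (hypothetical) record layout: "2004-10-15,AAPL,45.50,1200"
TRADE = re.compile(r"^(\d{4}-\d{2}-\d{2}),([A-Z]+),(\d+\.\d+),(\d+)$")


def extract_trades(lines: Iterable[str], symbol: str) -> Iterator[Tuple[str, float, int]]:
    """Yield (date, price, shares) for each well-formed line matching `symbol`."""
    for line in lines:
        m = TRADE.match(line.strip())
        if m and m.group(2) == symbol:
            # Lines that don't fit the pattern are silently skipped.
            yield m.group(1), float(m.group(3)), int(m.group(4))
```

Because an open file can be passed as `lines`, this handles one line at
a time no matter how big the file is.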
The key to these UNIX programs is that they operate on streams of data,
and can work well in a limited memory space. And there may well be
ready-made solutions to similar problems available under the GPL or in
the public domain. But they will require a considerable commitment in
learning time.
Yet another approach (also requiring more or less investment in
learning) would be a real database. A database would work better with
long-lived data -- "write once, read many" data. FileMaker Pro is
fairly easy to learn, but has serious shortcomings for the long term.
MySQL is free, but will require more learning time.
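
As a rough idea of the database route, here is a sketch that uses
SQLite through Python's built-in sqlite3 module. SQLite is a stand-in
chosen only because it needs no server; it is not one of the products
named above, though the SQL would look much the same in MySQL. The
table layout and function names are assumptions for illustration.

```python
import sqlite3
from typing import Iterable, List, Tuple


def load_trades(conn: sqlite3.Connection,
                rows: Iterable[Tuple[str, str, float, int]]) -> None:
    """Create the (hypothetical) trades table if needed and bulk-insert rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trades "
        "(day TEXT, symbol TEXT, price REAL, shares INTEGER)"
    )
    # An index makes repeated per-symbol lookups cheap ("read many").
    conn.execute("CREATE INDEX IF NOT EXISTS idx_symbol ON trades (symbol)")
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?)", rows)


def daily_volume(conn: sqlite3.Connection, symbol: str) -> List[Tuple[str, int]]:
    """Return (day, total shares) per day for one symbol, oldest first."""
    cur = conn.execute(
        "SELECT day, SUM(shares) FROM trades "
        "WHERE symbol = ? GROUP BY day ORDER BY day",
        (symbol,),
    )
    return cur.fetchall()
```

The data is loaded once; after that, each new question is a query
rather than another pass over the raw text file.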
:::: If addiction is judged by how long a dumb animal will sit pressing
a lever to get a "fix" of something, to its own detriment, then I would
conclude that the Internet is far more addictive than cocaine. -- Rob
Stampfli
:::: Jan Steinman <http://www.Bytesmiths.com/Image/98-4880-34>
Applescript-users mailing list (email@hidden)