Re: Parsing a text file
- Subject: Re: Parsing a text file
- From: email@hidden
- Date: Wed, 22 Aug 2001 10:01:54 -0400
On Tue, 21 Aug 2001 19:00:41 -0400, Matthew Fischer
<email@hidden> asked,
> I have some text files which are saved individual e-mail messages. Each of
> them has various types of header information, followed by the following:
>
> <password>a unique password of varying length</password>
> <username>a unique username of varying length</username>
> <message>some more unique text of varying length.
>
> This one could be more then one line.
>
> Blah, blah, blah.</message>

This looks like XML. You might see if the XML Tools scripting addition can just
solve your problems with no muddling around with the individual characters.
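
To show the idea of letting an XML parser do the work, here is a rough sketch in Python rather than with the XML Tools scripting addition. The sample text, the `parse_fields` name, and the `<root>` wrapper are my own assumptions for illustration: the saved message is not a complete XML document, so the sketch skips the headers and wraps the tagged part in a made-up root element before parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical saved message: plain headers followed by the tagged fields.
raw = (
    "From: someone@example.com\n"
    "Subject: account details\n"
    "\n"
    "<password>abc123</password>\n"
    "<username>mfischer</username>\n"
    "<message>First line.\n\nBlah, blah, blah.</message>\n"
)

def parse_fields(raw_text):
    """Return a dict of tag name -> text for the tagged part of the message."""
    # Assumption: the tagged section starts at the first <password> tag.
    start = raw_text.find("<password>")
    if start == -1:
        raise ValueError("no <password> element found")
    # Wrap the fragment in an invented root so it parses as one document.
    root = ET.fromstring("<root>" + raw_text[start:] + "</root>")
    return {child.tag: (child.text or "") for child in root}

fields = parse_fields(raw)
print(fields["username"])
```

This only works if the tagged text really is well-formed, so a bare "&" or "<" inside a field would make the parser raise an error.
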

If that can't do the job, consider using regular expressions, and look at the
RegEx Commands scripting addition. You can pretty easily match the things you
want, although it's a little trickier than just matching

    <password>(.*)</password>

because RegEx matching is "greedy", and will match from the first <password> to
the last </password>. As long as the stuff in between the XML-style markers
doesn't contain "<", you can use

    <password>([^<]*)</password>

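
To make the difference between the two patterns concrete, here is a minimal sketch using Python's `re` module instead of the RegEx Commands scripting addition. The sample strings and variable names are invented for illustration.

```python
import re

# Hypothetical text containing two password fields on one line.
text = "<password>abc123</password> <password>xyz789</password>"

# Greedy: (.*) runs from the first <password> to the last </password>.
greedy = re.search(r"<password>(.*)</password>", text).group(1)

# Negated class: [^<]* stops at the first "<", so each field matches alone.
safe = re.findall(r"<password>([^<]*)</password>", text)

# [^<] also matches newlines, so a multi-line message body is captured whole.
msg_text = "<message>First line.\n\nBlah, blah, blah.</message>"
message = re.search(r"<message>([^<]*)</message>", msg_text).group(1)

print(repr(greedy))
print(safe)
print(repr(message))
```

The greedy capture swallows the closing and opening tags in the middle, while the negated-class version returns the two passwords separately.
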
I think that, as a matter of professional growth, everyone who writes scripts or
programs should learn about regular expressions. They are very handy, whether
you use the RegEx scripting addition, the GREP capability in BBEdit, or the
regular expressions built into JavaScript. And I think anyone looking to the
future should consider XML. XML isn't a "learn it right now" thing, but if the
opportunity presents itself, you should take the plunge, rather than try to
sidestep it. As I understand, Mac OS X uses XML for many of its settings and
preferences files. So if this e-mail file really is XML, why not learn about
the language and find out about the tools already available, instead of grubbing
around with loops and characters and text item delimiters?
--
Scott Norton Phone: +1-703-299-1656
DTI Associates, Inc. Fax: +1-703-706-0476
2920 South Glebe Road Internet: email@hidden
Arlington, VA 22206-2768 or email@hidden