Re: How to parse a textfile ?
- Subject: Re: How to parse a textfile ?
- From: has <email@hidden>
- Date: Wed, 29 Sep 2004 18:17:19 +0100
Joseph Weaks wrote:
Ok, how about anyone want to take a stab at the format of a data file I
need to parse?
Here's code for a simple regex-based parser I had kicking around that
reads a hierarchical book index into nested lists; not exactly what
you want but it may give you some ideas:
-------
on parseIndex(str)
    set stack to {}
    set oldDepth to 0
    script k -- list access speed kludge
        -- uses Satimage osax
        property entries : find text "^(\\t*)([^" & tab & "]+)\\t+(.+)$" in str ¬
            using "\\1\\r\\2\\r\\3" with regexp, string result and all occurrences
    end script
    repeat with entry in k's entries
        set {indent, title, page} to entry's paragraphs
        set depth to (indent's length) + 1
        if depth > oldDepth then
            -- e.g. don't allow a tertiary element to be added to a primary element
            if depth - oldDepth is not 1 then error "Bad nesting."
            -- push a new empty contents list onto stack
            set stack's beginning to {}
        else if depth < oldDepth then
            -- pop the top contents list from stack and add it to its container
            repeat (oldDepth - depth) times
                set itm to stack's first item
                set stack to rest of stack
                set stack's first item's last item's content to itm
            end repeat
        end if
        -- add this entry to the top contents list
        set stack's first item's end to {title:title, page:page, content:{}}
        set oldDepth to depth
    end repeat
    -- pop the top contents list from stack and add it to its parent
    repeat (depth - 1) times
        set itm to stack's first item
        set stack to rest of stack
        set stack's first item's last item's content to itm
    end repeat
    -- return primary contents list
    return stack's first item
end parseIndex
-- TEST (each line: leading tabs for nesting depth, then title, tab(s), page)
set str to "Racing 8
Rally 12
European 12
American 14
African 15
F1 18
Japan 20
European 22
NASCAR 28
Motorcycle 23
Road 24
Motocross 26
American 27
Japan 29
Concours 30
Classics 31
Pebble Beach 32
Milan 34
Hotrods 36
American 38
Other 41
FastFurious 42
American 43
Japan 45
Shade tree 47
New 50
Literature 60
Magazines 62
Automobile 63
Hotrod 65
Books 67"
parseIndex(str)
-------
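For comparison, here is a rough Python sketch of the same idea (a regex to split each line, plus a stack to track nesting). It is a variation rather than a line-for-line port: instead of attaching child lists when popping, the stack holds references to each entry's content list. The name parse_index and the sample nesting are my own assumptions for illustration; the sample only borrows a few titles from the test data above.

```python
import re

# One entry per line: leading tabs (depth), title, one or more tabs, page.
ENTRY_RE = re.compile(r"^(\t*)([^\t\n]+)\t+(.+)$", re.MULTILINE)

def parse_index(text):
    """Return a nested list of {'title', 'page', 'content'} dicts."""
    root = []
    stack = [root]  # stack[-1] is the contents list for the current depth
    for indent, title, page in ENTRY_RE.findall(text):
        depth = len(indent) + 1
        if depth > len(stack):
            # Going deeper: one level at a time, and only below an existing entry.
            if depth - len(stack) != 1 or not stack[-1]:
                raise ValueError("Bad nesting.")
            stack.append(stack[-1][-1]["content"])
        else:
            # Same level or shallower: drop any deeper contents lists.
            del stack[depth:]
        stack[-1].append({"title": title, "page": page, "content": []})
    return root

# Hypothetical sample; the nesting shown here is assumed, not taken from the post.
sample = "Racing\t8\n\tRally\t12\n\t\tEuropean\t12\n\tF1\t18\nConcours\t30\n"
tree = parse_index(sample)
print([entry["title"] for entry in tree])  # ['Racing', 'Concours']
```

Lines that don't match the pattern are silently skipped, just as with the regex search in the AppleScript version, so this has the same "simple, uniform data only" limitation.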
This approach is only really suited to data that's simple, fairly uniform,
and doesn't need complex validation; if your own data doesn't fit those
criteria, you'll want a more powerful parser. Since your data doesn't
look too complex, you could probably hand-roll a basic FSM (finite state
machine) based parser in AppleScript. Alternatively, use a parser
generator, which takes a file describing the structure of your data and
generates the parser code for you. For example, there are various parser
generators available for Python (which is fairly easy to learn coming from
AppleScript), and Python scripts are easily called from AS via 'do shell
script' or packaged into scriptable applications. Or there are the
venerable Unix tools lex and yacc, if you're comfortable with C and want
to implement your parser as an osax, though that may be overkill here.
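To give a feel for the hand-rolled FSM option, here is a minimal sketch in Python that splits one line of the book index format (not your data file, whose format I'm only guessing at) into depth, title and page, one character at a time. The function name split_entry and the state names are made up for this example; a real parser for your data would need its own states and transitions.

```python
def split_entry(line):
    """Split one index line into (depth, title, page) with a small state machine."""
    state = "INDENT"
    depth, title, page = 1, [], []
    for ch in line:
        if state == "INDENT":
            if ch == "\t":
                depth += 1  # each leading tab is one more level of nesting
            else:
                title.append(ch)
                state = "TITLE"
        elif state == "TITLE":
            if ch == "\t":
                state = "SEPARATOR"
            else:
                title.append(ch)
        elif state == "SEPARATOR":
            if ch != "\t":  # skip any extra tabs between title and page
                page.append(ch)
                state = "PAGE"
        else:  # state == "PAGE": everything left belongs to the page field
            page.append(ch)
    if state != "PAGE":
        # Ran out of input before reaching a page field.
        raise ValueError("Malformed entry: %r" % line)
    return depth, "".join(title), "".join(page)

# Lines are expected without their trailing newline, e.g. from text.splitlines().
print(split_entry("\t\tEuropean\t12"))
```

The advantage over a regex is that each state is an obvious place to hang validation or error reporting, which is where regex-only parsers tend to run out of steam.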
HTH
has
--
http://freespace.virgin.net/hamish.sanderson/
_______________________________________________
Applescript-users mailing list (email@hidden)