Re: How to parse a textfile ?


  • Subject: Re: How to parse a textfile ?
  • From: has <email@hidden>
  • Date: Wed, 29 Sep 2004 18:17:19 +0100

Joseph Weaks wrote:

> Ok, how about anyone want to take a stab at the format of a data file I
> need to parse?


Here's code for a simple regex-based parser I had kicking around that reads a hierarchical book index into nested lists; not exactly what you want but it may give you some ideas:

-------

on parseIndex(str)
    set stack to {}
    set oldDepth to 0
    script k -- list access speed kludge
        -- uses Satimage osax; each match is returned as indent, title and page on separate lines
        property entries : find text "^(\\t*)([^" & tab & "]+)\\t+(.+)$" in str using "\\1\\r\\2\\r\\3" with regexp, string result and all occurrences
    end script
    repeat with entry in k's entries
        set {indent, title, page} to entry's paragraphs
        set depth to (indent's length) + 1
        if depth > oldDepth then
            -- e.g. don't allow a tertiary element to be added to a primary element
            if depth - oldDepth is not 1 then error "Bad nesting."
            set stack's beginning to {} -- push a new empty contents list onto stack
        else if depth < oldDepth then
            repeat (oldDepth - depth) times -- pop the top contents list from stack and add it to its container
                set itm to stack's first item
                set stack to rest of stack
                set stack's first item's last item's content to itm
            end repeat
        end if
        set stack's first item's end to {title:title, page:page, content:{}} -- add this entry to the top contents list
        set oldDepth to depth
    end repeat
    if stack is {} then return {} -- no lines matched, so there is nothing to unwind
    repeat (oldDepth - 1) times -- pop the top contents list from stack and add it to its parent
        set itm to stack's first item
        set stack to rest of stack
        set stack's first item's last item's content to itm
    end repeat
    return stack's first item -- return primary contents list
end parseIndex


-- TEST
set str to "Racing			8
	Rally		12
		European	12
		American	14
		African	15
	F1		18
		Japan	20
		European	22
	NASCAR		28
	Motorcycle		23
		Road	24
		Motocross	26
		American	27
		Japan	29
Concours			30
	Classics		31
		Pebble Beach	32
		Milan	34
	Hotrods		36
		American	38
		Other	41
	FastFurious		42
		American	43
		Japan	45
	Shade tree		47
New			50
Literature			60
	Magazines		62
		Automobile	63
		Hotrod	65
	Books		67"
parseIndex(str)

-------

This approach is only really suited to data that's simple, fairly uniform and doesn't need complex validation; if your own data doesn't fit those criteria, you'll really want a more powerful parser.

Your data doesn't look too complex, so you could probably hand-roll a basic finite state machine (FSM) based parser in AppleScript. Alternatively, use a parser generator, which takes a file describing the structure of your data and generates the parser code for you. For example, there are various parser generators available for Python (which is fairly easy to learn coming from AppleScript), and Python scripts are easily called from AS via 'do shell script' or packaged into scriptable applications. Or there are the venerable Unix tools lex and yacc, if you're comfortable with C and want to implement your parser as an osax, though that may be overkill here.
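
To give a feel for the Python route, here is a rough sketch of the same stack-based idea as the AppleScript handler above, hand-rolled with Python's standard 're' module rather than a parser generator. The function name parse_index and the choice to skip lines that don't match are my own assumptions for illustration; it expects the same tab-indented "title, tabs, page" layout as the test data.

```python
import re

# One entry per line: leading tabs give the depth, then a title,
# one or more tabs, and the page (same layout as the test data above).
LINE = re.compile(r"^(\t*)([^\t]+)\t+(.+)$")


def parse_index(text):
    """Parse a tab-indented index into nested lists of dicts (hypothetical helper)."""
    root = []       # primary contents list
    stack = [root]  # stack[-1] is the list that the next entry is appended to
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # assumption: blank or malformed lines are silently skipped
        indent, title, page = match.groups()
        depth = len(indent) + 1
        if depth > len(stack):
            # Only allow going one level deeper, and only below an existing entry.
            if depth - len(stack) != 1 or not stack[-1]:
                raise ValueError("Bad nesting: " + line.strip())
            stack.append(stack[-1][-1]["content"])
        else:
            del stack[depth:]  # pop back up to the list for this depth
        stack[-1].append({"title": title, "page": page, "content": []})
    return root
```

Because Python lists are shared by reference, each entry's content list can be pushed onto the stack directly, so there's no need for the copy-back step the AppleScript version does when popping. To call something like this from AS via 'do shell script' you would still need a small wrapper that reads the text from stdin or a file and prints the result in a form your script can read.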

HTH

has
--
http://freespace.virgin.net/hamish.sanderson/