Re: How to parse a textfile ?

Subject: Re: How to parse a textfile ?
From: has <email@hidden>
Date: Wed, 29 Sep 2004 18:17:19 +0100

Joseph Weaks wrote:

Ok, how about anyone want to take a stab at the format of a data file I
need to parse?

Here's code for a simple regex-based parser I had kicking around that reads a hierarchical book index into nested lists; not exactly what you want but it may give you some ideas:

-------

on parseIndex(str)
	set stack to {}
	set oldDepth to 0
	script k -- list access speed kludge
		property entries : find text "^(\\t*)([^" & tab & "]+)\\t+(.+)$" in str using "\\1\\r\\2\\r\\3" with regexp, string result and all occurrences -- uses Satimage osax
	end script
	repeat with entry in k's entries
		set {indent, title, page} to entry's paragraphs
		set depth to (indent's length) + 1
		if depth > oldDepth then
			if depth - oldDepth is not 1 then error "Bad nesting." -- (e.g. don't allow a tertiary element to be added to a primary element)
			set stack's beginning to {} -- push a new empty contents list onto stack
		else if depth < oldDepth then
			repeat (oldDepth - depth) times -- pop the top contents list from stack and add it to its container
				set itm to stack's first item
				set stack to rest of stack
				set stack's first item's last item's content to itm
			end repeat
		end if
		set stack's first item's end to {title:title, page:page, content:{}} -- add this entry to the top contents list
		set oldDepth to depth
	end repeat
	repeat (depth - 1) times -- pop the top contents list from stack and add it to its parent
		set itm to stack's first item
		set stack to rest of stack
		set stack's first item's last item's content to itm
	end repeat
	return stack's first item -- return primary contents list
end parseIndex

-- TEST
set str to "Racing			8
	Rally		12
		European	12
		American	14
		African	15
	F1		18
		Japan	20
		European	22
	NASCAR		28
	Motorcycle		23
		Road	24
		Motocross	26
		American	27
		Japan	29
Concours			30
	Classics		31
		Pebble Beach	32
		Milan	34
	Hotrods		36
		American	38
		Other	41
	FastFurious		42
		American	43
		Japan	45
	Shade tree		47
New			50
Literature			60
	Magazines		62
		Automobile	63
		Hotrod	65
	Books		67"
parseIndex(str)

-------

This approach is only really suited to data that's simple, fairly uniform, and doesn't require complex validation; if your own data doesn't match those criteria, you'll want a more powerful parser. Your data doesn't look too complex, so you could probably hand-roll a basic FSM (finite state machine) based parser in AppleScript. Alternatively, use a parser generator, which takes a file describing the structure of your data and generates the parser code for you. For example, there are various parser generators available for Python (which is fairly easy to learn coming from AppleScript), and Python scripts are easily called from AS via 'do shell script' or packaged into scriptable applications. Or there are the venerable lex and yacc Unix tools, if you're comfortable with C and want to implement your parser as an osax, though that may be overkill here.
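
As a rough illustration of the hand-rolled route in Python, here is a minimal sketch that reads the same tab-indented index format line by line and keeps a stack of open contents lists. It is not a port of the AppleScript above: the function name parse_index, the dictionary keys, and the error messages are all assumptions made for this example, and real data would likely need more validation.

```python
import re

# Assumed line format, mirroring the test data above: leading tabs give
# the depth, then a title, one or more tabs, and a page reference.
LINE = re.compile(r"^(\t*)([^\t]+)\t+(.+)$")


def parse_index(text):
    """Parse a tab-indented index into nested lists of dicts (a sketch)."""
    root = []
    # stack[d] is the contents list that entries at depth d + 1 are added to.
    stack = [root]
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # skip blank lines
        m = LINE.match(line)
        if m is None:
            raise ValueError(f"line {lineno}: unrecognised entry {line!r}")
        indent, title, page = m.groups()
        depth = len(indent) + 1
        if depth > len(stack):
            # e.g. a tertiary entry directly under a primary one
            raise ValueError(f"line {lineno}: bad nesting")
        del stack[depth:]  # close any deeper levels that are now finished
        entry = {"title": title, "page": page, "content": []}
        stack[depth - 1].append(entry)
        stack.append(entry["content"])  # children of this entry go here
    return root
```

Calling parse_index on the test string above would give a list of top-level entries, each with its sub-entries under "content". Pages are kept as strings, since nothing in the format guarantees they are plain numbers.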

HTH

has
--
http://freespace.virgin.net/hamish.sanderson/