Text parsing
- Subject: Text parsing
- From: Bill Monk <email@hidden>
- Date: Tue, 2 Mar 2004 01:27:15 -0600
I'm trying to extract useful text from a larger chunk. It's probably
easy, but I seem to be stuck.
The text chunks are around 2000 characters each (I have no control over
the contents).
Lines are delimited by carriage returns.
Lines may contain anything, typically URLs made up mostly of numbers.
10-12 lines in each blob are of interest.
Interesting lines always contain a semaphore word, enclosed in square
brackets.
The semaphore is the same for every interesting line within any given
chunk.
The semaphore always precedes the interesting text, but it may not be the
first thing on a line.
Examples:
foo [person] 1. My Name (And Address)
bar [song] 11. Song Title (By The Guy Who Wrote It)
Interesting lines have these "fields"
[variable # of garbage chars]
[variable # of spaces]
[semaphore word enclosed in brackets]
[variable # of spaces]
[one or two digits]
[a period]
[variable # of spaces]
[variable # of words of interesting text, call it A]
[variable # of spaces]
[optional open parenthesis]
[if open paren is present, variable # of words of interesting text, call it B]
[optional close paren]
For each interesting line, I want to put interesting text "A" into an
item in a list, and interesting text "B" into a corresponding item in a
second list (or empty string if no text "B").
Suggestions?
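One possible way to express the field layout described above is a single regular expression plus a little string splitting. The sketch below uses Python rather than AppleScript, purely to illustrate the matching logic. The function name extract_fields and the sample chunk are made up for illustration, and it assumes the semaphore word for the chunk is already known and that text "A" never contains an open parenthesis.

```python
import re

def extract_fields(chunk, semaphore):
    """Return (list_a, list_b) for the lines of chunk that contain [semaphore].

    Hypothetical helper: the name and signature are illustrative only.
    """
    # [semaphore], optional spaces, one or two digits, a period,
    # optional spaces, then the rest of the line.
    pattern = re.compile(r"\[" + re.escape(semaphore) + r"\]\s*\d{1,2}\.\s*(.*)")
    list_a, list_b = [], []
    for line in chunk.splitlines():  # splitlines() also splits on a bare "\r"
        match = pattern.search(line)  # search(), since garbage may precede the semaphore
        if match is None:
            continue  # not an interesting line
        rest = match.group(1)
        # Text A is everything before the first "(", text B is what follows it.
        text_a, paren, text_b = rest.partition("(")
        if paren:
            # Drop the optional close paren and anything after it.
            text_b = text_b.split(")", 1)[0]
        list_a.append(text_a.strip())
        list_b.append(text_b.strip())  # empty string when there is no text B
    return list_a, list_b

# Made-up sample chunk, with lines delimited by carriage returns.
sample = ("http://10.0.0.1/123/456\r"
          "foo [person] 1. My Name (And Address)\r"
          "bar [person] 12.   Other Name\r"
          "baz [song] 3. Not This One")
names, details = extract_fields(sample, "person")
print(names)    # ['My Name', 'Other Name']
print(details)  # ['And Address', '']
```

The two lists stay aligned because every interesting line appends exactly one item to each, with an empty string standing in when there is no text "B".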