Extracting from text [Lecture]
- Subject: Extracting from text [Lecture]
- From: Nigel Garvey <email@hidden>
- Date: Thu, 28 Feb 2002 18:29:02 +0000
Apologies for this. :-)
It's now generally well known amongst scripters that constructions like:
(characters 5 thru 137 of myText) as string
... can be horrendously wasteful in terms of the memory, processor
cycles, and typing required, and that the equivalent 'text' command is to
be preferred:
text 5 thru 137 of myText
The 'as string' construction creates an intermediate, 133-item list,
consisting of 133 single-character strings (double-character strings if
they're zero-terminated as they were on my Atari) and 133 pointers - each
of which is probably longer than the string to which it points. This list
is then immediately abandoned in favour of the string that's then derived
from it. The 'text' line, on the other hand, simply copies the derived
string directly from the original.
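As a quick check that the two forms really do give the same string, here's a minimal sketch. The sample text and the variable names are my own, purely for illustration, and the delimiters are reset first because the list coercion depends on them while the 'text' form doesn't.

```applescript
-- Hypothetical sample text, chosen only for illustration.
set myText to "The quick brown fox jumps over the lazy dog"

-- The list-to-string coercion joins the items with the current
-- text item delimiters, so make sure they're empty first.
set AppleScript's text item delimiters to {""}

-- Builds a 5-item list of single characters, then coerces it.
set viaList to (characters 5 thru 9 of myText) as string

-- Copies the same range directly, with no intermediate list.
set viaText to text 5 thru 9 of myText

{viaList, viaText, viaList = viaText}
--> {"quick", "quick", true}
```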
Although this is, as I've said, now well known, I've seen several
well-respected scripters over the past week give the following advice in
respect of path strings:
set AppleScript's text item delimiters to {":"}
set parentPath to (text items 1 thru -3 of folderPath) as string
This isn't perhaps as potentially calamitous as listing and then coercing
the individual characters of a large text, but its use does seem to
suggest that the authors don't know about the equivalent 'text'
construction, which is (in its simplest form):
text 1 thru text item -3 of folderPath
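For a concrete picture, here's a sketch with a made-up path (the volume and folder names are hypothetical). Saving and restoring the delimiters is just my own housekeeping habit, not something the 'text' form itself requires.

```applescript
-- Hypothetical folder path, with the usual trailing colon.
set folderPath to "Macintosh HD:Users:nigel:Documents:Scripts:"

-- Remember the current delimiters so they can be put back afterwards.
set savedDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {":"}

-- The text items here are "Macintosh HD", "Users", "nigel",
-- "Documents", "Scripts", and a final empty string after the
-- trailing colon, so text item -3 is "Documents".
set parentPath to text 1 thru text item -3 of folderPath

set AppleScript's text item delimiters to savedDelims

parentPath
--> "Macintosh HD:Users:nigel:Documents"
```

Note that the result has no trailing colon, so one has to be appended if the parent path is itself to be used as a folder path string.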
The 'text' command - like its list-producing cousins 'characters',
'words', 'paragraphs', and 'text items' - has very powerful and flexible
boundary parameters. A plain number parameter refers to a unit of the
same type as the command (which is a character in the case of either
'text' or 'characters'). A qualifier is used when some other unit is to
be understood instead. Possible qualifiers are 'character', 'word',
'paragraph', and 'text item'. The example above means "text from
character 1 up to and including the third text item from the end". There's
an alternative syntax for this:
text from 1 to text item -3 of folderPath
... though this will compile to the 'thru' form if the first parameter
doesn't have a qualifier. This alternative form, though, is the only option
if the first parameter *does* have a qualifier. For instance, if you
wanted the folder path string without the volume name:
text from text item 2 to -1 of pathString
... or the parent path without the volume name:
text from text item 2 to text item -3 of pathString
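Here are those two lines worked through on a made-up path, as a sketch only (the path value is hypothetical, and the delimiter saving and restoring is again my own addition):

```applescript
-- Hypothetical path string for illustration.
set pathString to "Macintosh HD:Users:nigel:Documents:Scripts:"

set savedDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {":"}

-- From the start of text item 2 ("Users") to the last character.
set noVolume to text from text item 2 to -1 of pathString

-- From the start of text item 2 to the end of text item -3 ("Documents").
set parentNoVolume to text from text item 2 to text item -3 of pathString

set AppleScript's text item delimiters to savedDelims

{noVolume, parentNoVolume}
--> {"Users:nigel:Documents:Scripts:", "Users:nigel:Documents"}
```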
The order of the 'from' and 'to' parameters isn't particularly important.
The result always reads from the item nearer the beginning of the
original string to the item nearer the end:
set t to "This is a some text"
text from word 3 to 2 of t -- same as text 2 thru word 3 of t
--> "his is a"
An exception to this is when one boundary parameter is within the other.
In this case, the order does make a difference:
set t to "'Antidisestablishmentarianism' is a long word"
text 5 thru word 1 of t
--> "idisestablishmentarianism"
text from word 1 to character 5 of t
--> "Anti"
The parameters 'beginning' and 'end' can be used instead of 1 and -1 to
denote the beginning and end of the source string.
The boundary parameters for 'text' can also be used with 'characters',
'words', 'paragraphs', and 'text items':
set t to "This is a piece of
text which contains
three so-called 'paragraphs'."
words from character 13 to paragraph 2 of t
--> {"ece", "of", "text", "which", "contains"}
And finally, of course, for those who prefer, the Saxon genitive can be
used in place of 'of'.
t's text from paragraph 2 to the end
NG
PS. It seems you *can* after all use the 'thru' syntax when the first
parameter's qualified, but you have to bracket the parameter to get it to
compile properly:
t's text (paragraph 2) thru paragraph 19