Quark: production scripts [long]
- Subject: Quark: production scripts [long]
- From: Michael Turner <email@hidden>
- Date: Tue, 24 Apr 2001 15:05:11 -0400
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [long] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Dear list,
I have made tremendous progress on a project to export QuarkXPress documents
to Xtags. This list has been of tremendous assistance in creating
these scripts. Thanks to everyone who has contributed to that process. Hans
& JJ come quickly to mind.
I am going to ask "What next" in this post. I have a functional script. It
basically does what I want. But, I am looking for suggestions for how to
improve this process "fundamentally".
To do that I need to give an overview of the project. Here it is.
A QuarkXPress production script
I had a project land in my lap very quickly, and I had to respond
to it in just a couple of weeks. I don't really know AppleScript. I knew its
capabilities and had written some useful droplets, but I didn't (and still
don't) feel ready for a large-scale project. I recommended AppleScript
initially, but my recommendation was rejected... then the alternative failed,
and I was assigned my own recommendation.
I wrote these scripts very quickly and without much knowledge of
AppleScript. So I expect they are faulty, perhaps even fundamentally flawed
(although I have the results I need). Overall, I am very pleased with the
results. The scripts work. So, why show them to the list? Because I need a
better understanding of what I am doing. The scripts have two flaws
that I cannot fix. First, they are too slow. Second, they are not
comprehensive enough. By that I mean they could do more automatically than I
am currently achieving.
Anyway, what is this project?
An overview: take thousands of pages set in QuarkXPress Macintosh to HTML.
Starting with a large number of QuarkXPress documents, my goal is to export
all of the materials to a rich ASCII format. Xtags is my preferred export,
because it preserves info about italics, fonts, etc. I also collect
information about the box itself. Colleagues take my export and create web
format -- HTML, based mostly on the Quark "styles". Images are exported as
an image "path" along with scaling information.
Note: BeyondPress was investigated and found lacking. It creates HTML from
Quark, but it makes a mess of the documents. We needed consistent,
high-quality materials. Other web export techniques from Quark were found
lacking for similar reasons. We want to roll our own export technique.
My material is mostly mathematics textbooks: calculus, algebra, discrete
mathematics, etcetera. In order to enable high fidelity mathematics to be set
inside a QuarkXPress document, the compositors used one of several
"Xtensions": PowerMath or MathSetter. An Xtension is a QuarkXPress plugin
that expands the capability of QuarkXPress. The contents of these Xtensions
have to be "exported" separately from the document contents. Their export
language is usually TeX. Export from one of these Xtensions creates ASCII
text which is embedded in the Quark doc at the point of insertion.
Here is where one of my bugs comes in. The ASCII exported by MathSetter is
not recognized by Quark as normal text. When I export this material to
Xtags, all of the PowerMath material is dumped as "null". However, when I
export to ASCII from Quark, the PowerMath material is recognized and
delivered intact as ASCII characters. So I export twice: once to Xtags and
once to ASCII. These two files are then merged to create a document with
both rich font info and TeX mathematical formula info.
So, my process is to start at the end of a Quark document and export each
section via script. I make a selection of text up to the point where an
image is mentioned in the text. Then I capture the caption, then the image,
then go back to the text. Tables work in a similar manner. Which is to say, I
am creating a logical flow from the textbook. Each section consists of the
next logical part of the page. There is no logical flow in Quark, so I have
to add the logic to the page by exporting in order and with several different
parameters: text, caption, image or table. I start at the bottom of the
page, deleting material as I go to avoid exporting anything twice.
Basically, I un-composite Quark files. My result is a series of ASCII text
files which I deliver to the programmer next door. Each file name is built
from four parts:
- page number ([0-9]*)
- element type ([ctia]): Caption, Text, Image or tAble ("a" stands for tAble)
- element position ([0-9]*)
- output type ([ax]): ascii or xtags
The element position runs backwards because I export each page from the
bottom. So element 10 is the last position on the page (positions go up in
steps of 10). There is one ascii and one xtags file for each caption and each
text file output, regardless of whether there is an Xtension object in the
flow.
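To make the naming scheme concrete, here is a minimal sketch of how such a
name can be assembled. The handler name makeFileName and its parameter names
are made up for this illustration; the scripts below build the same string
inline, without a handler.

```applescript
-- Sketch only: makeFileName is a hypothetical helper, not part of the scripts.
-- pageNum    : page number, e.g. 112
-- kindCode   : "c" caption, "t" text, "i" image, "a" tAble
-- elementNum : position on the page, counted up from the bottom in steps of 10
-- formatCode : "x" for the Xtags export, "a" for the ASCII export
on makeFileName(pageNum, kindCode, elementNum, formatCode)
	-- the leading "" forces string concatenation instead of building a list
	return "" & pageNum & kindCode & elementNum & formatCode
end makeFileName

makeFileName(112, "t", 10, "x") -- "112t10x": last text element on page 112, Xtags
makeFileName(112, "t", 10, "a") -- "112t10a": the same element, ASCII
```

Because the element position runs backwards, whoever reassembles a page has
to read the files for that page from the highest position down to 10.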
OK! Now the scripts... I will be showing everyone "textXport" and
"picXport".
Known flaws: textXport doesn't handle nested text boxes. I have to stop the
selection at each nested text box. picXport must be run after an image
is selected, rather than running through the document as a whole and
converting the whole shebang. If I could convert the images to text and
export nested text, I could export much larger sections at a time, only
stopping to add logical text flow information.
Oh, and things I forgot to mention: I am keeping information about the
script output position (page # and element #) in an invisible resource file
known to my scripts and editable by my scripts. I have another script to
initialize this "data structure". But it isn't very interesting. Another
script presents this information should I need it. Several simple scripts
help to position elements on the page so that I can move through the
document faster. But those scripts are simple and can be ignored.
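The bookkeeping on page number and element number is easier to follow in
isolation. The sketch below shows only the counter logic that both scripts
perform inline near their top. The handler name advanceElement is
hypothetical, and the list is cut down to the two items the handler touches
(the real list in the resource file has more).

```applescript
-- Sketch only: advanceElement is a hypothetical helper.
-- statesList : {page last exported from, last element number used}
-- actualPage : the page the current selection is on
-- Returns the element number to use for this export and updates statesList.
on advanceElement(statesList, actualPage)
	if (item 1 of statesList) is actualPage then
		-- same page: step to the next slot
		set elementNum to (item 2 of statesList) + 10
	else
		-- new page: remember it and start over at 10
		set item 1 of statesList to actualPage
		set elementNum to 10
	end if
	set item 2 of statesList to elementNum
	return elementNum
end advanceElement

set demoStates to {112, 20}
advanceElement(demoStates, 112) -- 30: next export on page 112
advanceElement(demoStates, 111) -- 10: first export on page 111
```

Lists are passed to handlers by reference, so the caller's list is updated in
place and can then be written back with "store script".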
All of these scripts are activated by CE Software's QuicKeys.
I have marked lines that were too long for email with a "-- long line"
comment.
--------[script textXport]--------
set scriptPath to "Macintosh HD:Desktop Folder:scripts:resource"
-- hidden resource which contains parameters
-- for page number (currentPage) & element number (elementNum)
set theScript to (load script alias scriptPath)
-- accesses this hidden script
set currentStates to States of theScript
-- set this script to the parameters stored
-- in hidden resource script
set currentPage to item 1 of currentStates as integer
-- page last captured from
set elementNum to (item 2 of currentStates as integer) + 10
-- get element number & increment to avoid overwriting file
set sVersion to item 3 of currentStates as real
-- track version number of script & later store in ascii output file.
set thePath to item 4 of currentStates
-- thePath here is the folder to place output files.
set addTag to item 6 of currentStates as string
-- markup tag to indicate "additional information"
-- exported with each output file (2 output files per run)
set addTagEnd to item 7 of currentStates as string
-- end of additional markup tag. I stashed it in the invisible
-- resource file because all of my scripts use this tag.
-- If I want to change it, that is convenient.
set item 2 of currentStates to elementNum
-- increment the element position in the resource file
tell application "QuarkXPress(TM) 4.1" (* actually 4.11 *)
activate
tell document 1
set menuInfo to Query Menu menu title "MathSetter"
-- this script works with documents
-- that have the MathSetter Xtension.
set menuItemInfo to item 3 of menuInfo
set menuItemText to menu item text of item 3 of menuInfo
set menuEnabled to (menuItemText is "UnTypeset")
-- If there is a MathSetter object,
-- then the menu will be enabled.
-- Otherwise, I don't have to activate
-- the Xtension's export function.
-- menuEnabled is set either way, so the later
-- "if (menuEnabled is true)" tests never see an undefined variable.
set pageNumber to page number of current page
-- not the same thing as currentPage
-- currentPage is the actual page number
-- as stored in the hidden resource file
-- pageNumber is a different value,
-- because Quark can have 2 page number 10s.
-- this value is stored so I can zoom to the
-- correct position. And so that if the script
-- breaks, it will be obvious by yanking me to an
-- incorrect position.
if not (currentPage = (name of current page as integer)) then
-- if currentPage doesn't match the
-- actual current page, then I have moved to a new page.
-- All of the parameters have to be reset.
set currentPage to name of current page as integer
set elementNum to 10
--
set item 1 of currentStates to currentPage
set item 2 of currentStates to elementNum
-- update resource file versions
end if
cut
-- there must be a selection made before this cut
-- (no error check?)
set boxBounds to bounds of current box
-- save for later reference
make text box at the beginning of the current page ¬
with properties {bounds:boxBounds, name:"FredTheBox"}
-- long line
set boxProperties to ((properties of current box))
set boxProperties to MakeValueText (boxProperties)
-- convenient record to string conversion
-- so that I can export this info with my output.
set elementString to "" & currentPage & "t" & elementNum
-- file name: page number, t to indicate "text"
-- and element number. output type "x" & "a" added later.
set addString to return & addTag & ¬
"<WIDTH=" & (width of bounds of current box) & ">" & ¬
"<HEIGHT=" & (height of bounds of current box) & ">" & ¬
"<BOXPROPERTIES=" & boxProperties & ">" & ¬
"<VERSION=" & sVersion & ">" & addTagEnd & return
-- long line
-- this string is appended to the end of each output.
set selection to text of story 1 of text box "FredTheBox" ¬
of current spread
-- long line
paste
-- dump selected text into an identical box
-- laid on top of selected box.
TypeText {ASCII character 32} -- [space]
-- prevents weird bug where the last exported
-- ascii output of an Xtension is ignored by AS "after" command.
show page pageNumber
-- jump to the page where the export is SUPPOSED to take place
set selection to text of story 1 of ¬
text box "FredTheBox" of current spread
-- long line
-- now select newly deposited text, so that the
-- Xtension export can be triggered.
if not story 1 of text box "FredTheBox" of current spread = "" then
try
if (menuEnabled is true) then
Select Menu Item menu title "MathSetter" menu item ID 3 ¬
with option key
-- long line
-- Option-UntypeSet
-- trigger for Xtension export
-- {with special option key activated,
-- very cool OSA Menu function.}
end if
on error
-- do nothing
-- oh, oh, someone is going to jump on me for this.
-- I suspect the try block is unnecessary.
end try
try
tell text box "FredTheBox" of current spread
set after story 1 to addString
-- addString is the additional material (boxProperties)
-- set addString to end of story
end tell
on error
display dialog "Script Broke!"
-- never happens.
end try
save story 1 of current box in (thePath & elementString & "a")
-- (from Hans) this will export ASCII text.
-- This was harder to find and implement than
-- it should have been.
end if
activate
-- Make quark active again.
show page pageNumber
-- jump back to proper page (or break obviously)
---------------------------------------------------------------
set selection to text of story 1 of ¬
text box "FredTheBox" of current spread
-- long line
-- this is repeating sequence from above: create new box,
-- paste materials into it, then export
-- Xtension with different settings.
paste
-- pasting in copy of selected material cut above.
-- This is the identical materials, unaltered.
TypeText {ASCII character 32} -- [space]
-- prevents weird bug where the last exported
-- ascii output of an Xtension is ignored by AS "after" command.
show page pageNumber
-- jump to correct position, or break obviously.
-- (probably redundant)
set selection to text of story 1 of ¬
text box "FredTheBox" of current spread
-- long line
-- reselect materials in box.
if not story 1 of text box "FredTheBox" of current spread = "" then
try
if (menuEnabled is true) then
Select Menu Item menu title "MathSetter" ¬
menu item text "UnTypeset"
-- long line
-- no "option" this time, export similar otherwise
-- (this time the export is _not_ "latex")
end if
on error
-- do nothing
-- good materials for a flame...
end try
try
tell text box "FredTheBox" of current spread
set after story 1 to addString
--set addString to end of story
-- addString is the additional materials
-- collected earlier about the box (boxProperties)
end tell
on error
display dialog "Script Broke!"
-- doesn't happen
end try
save story 1 of current box as "TEXT" in ¬
(thePath & elementString & "x")
-- long line
-- xtag export
end if
show page pageNumber
-- jump to proper page, or break obviously.
set tool mode to drag mode
cut
-- deletes the "extra" box created on top of current box
set tool mode to contents mode
-- return to right tool for next selection.
----------------------------------------
show page pageNumber
-- jump to proper page, or break obviously,
-- almost certainly redundant.
beep
activate
end tell
end tell
set States of theScript to currentStates
store script theScript in (alias scriptPath) replacing yes
-- save the values stored in hidden resource file
--------[/script]--------
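The addString trailer above is assembled by hand, one "<NAME=value>" tag at a
time. As a sketch of how the tag syntax could be kept in one place, the two
handlers below build the same kind of string from a list of name/value
pairs. The handler names makeTag and makeTrailer are hypothetical, and
"<ADD>" and "</ADD>" are placeholders standing in for addTag and addTagEnd,
whose real values live in the resource file and are not shown in this post.

```applescript
-- Sketch only: makeTag and makeTrailer are hypothetical helpers.
on makeTag(tagName, tagValue)
	-- produces e.g. "<WIDTH=120>"
	return "<" & tagName & "=" & tagValue & ">"
end makeTag

-- tagPairs is a list of {name, value} pairs; openTag and closeTag
-- play the role of addTag and addTagEnd in textXport
on makeTrailer(openTag, tagPairs, closeTag)
	set trailer to return & openTag
	repeat with onePair in tagPairs
		set trailer to trailer & makeTag(item 1 of onePair, item 2 of onePair)
	end repeat
	set trailer to trailer & closeTag & return
	return trailer
end makeTrailer

-- placeholder tags and values, for illustration only
makeTrailer("<ADD>", {{"WIDTH", 120}, {"HEIGHT", 45}}, "</ADD>")
```

Inside a "tell application" block the call has to be written as
"my makeTrailer(...)", otherwise AppleScript sends the command to the
application instead of to the script.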
--------[script picXport]--------
-- same initializing script as "textXport"
set scriptPath to "Macintosh HD:Desktop Folder:scripts:resource"
-- hidden resource which contains parameters for
-- page number (currentPage) & element number (elementNum)
set theScript to (load script alias scriptPath)
-- accesses this hidden script
set currentStates to States of theScript
-- set this script to the parameters stored in hidden resource script
set currentPage to item 1 of currentStates as integer
-- page last captured from
set elementNum to (item 2 of currentStates as integer) + 10
-- get element number & increment to avoid overwriting file
set sVersion to item 3 of currentStates as real
-- track version number of script & later store in ascii output file.
set thePath to item 4 of currentStates
-- thePath here is the folder to place output files.
set addTag to item 6 of currentStates as string
-- markup tag to indicate "additional information"
-- exported with each output file (2 output files per run)
set addTagEnd to item 7 of currentStates as string
-- end of additional markup tag. I stashed it in the invisible
-- resource file because all of my scripts use this tag.
-- If I want to change it, that is convenient.
set item 2 of currentStates to elementNum
-- increment the element position in the resource file
tell application "QuarkXPress(TM) 4.1" (* actually it's 4.11 *)
activate
set pageNumber to page number of current page
-- not the same thing as currentPage
-- currentPage is the actual page number
-- as stored in the hidden resource file
-- pageNumber is a different value,
-- because Quark can have 2 similar page numbers.
-- this value is stored so I can zoom to the
-- correct position. And so that if the script
-- breaks, it will be obvious by yanking me to an incorrect position.
if not (currentPage = (name of current page as integer)) then
-- if currentPage doesn't match the
-- actual current page, then I have moved to a new page.
-- All of the parameters have to be reset.
set currentPage to name of current page as integer
set elementNum to 10
set item 1 of currentStates to currentPage
set item 2 of currentStates to elementNum
end if
tell every image of current box
set filePath to "" & file path
-- file path gives me most everything I
-- need to know: a unique name for the image
set imageScale to scale as list
-- almost never used, embedded in text file anyway.
set imageProperties to its properties
-- ditto
end tell
-- repeat block despite only addressing a single selection. (error?)
-- achieved largely from trial and error
set xScle to (item 1 of imageScale) as text
set yScle to (item 2 of imageScale) as text
set boxBounds to bounds of current box
set boxProperties to ((properties of current box))
set boxProperties to MakeValueText (boxProperties)
set imageProperties to MakeValueText (imageProperties)
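set elementString to "" & currentPage & "i" & elementNum
-- the leading "" keeps this a string, as in textXport;
-- without it, integer & string builds a list.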
set addString to return & addTag & ¬
"<WIDTH=" & (width of bounds of current box) & ">" & ¬
"<HEIGHT=" & (height of bounds of current box) & ">" & ¬
"<XSCALE=" & xScle & ">" & "<YSCALE=" & yScle & ">" & ¬
"<BOXPROPERTIES=" & boxProperties & ">" & ¬
"<IMAGEPROPERTIES=" & imageProperties & ">" & ¬
"<VERSION=" & sVersion & ">" & addTagEnd & return
-- long line
tell document 1
set horizontal measure to millimeters
set vertical measure to millimeters
-- I prefer millimeters for the sizing of my graphics.
(* picas have been misinterpreted many
times by the programmers following me in the process.*)
make text box at beginning of spread 1 with properties ¬
{bounds:boxBounds}
-- long line
set text of story 1 to filePath & addString
-- I don't need much in this export file.
-- addString is the box info collected earlier
-- boxProperties and xScle & yScle etc.
if not story 1 = "" then
save story 1 as "TEXT" in (thePath & elementString & "x")
end if
-- xport as xtags if there is a story
-- (there is always a story...)
-- why xtags...
-- because I started doing it that way to begin with
-- and now don't dare change.
set tool mode to drag mode
cut
-- deletes the current box (it's not anchored)
set selection to text of story 1
set tool mode to drag mode
cut
-- deletes the box including image (anchored)
set tool mode to contents mode
-- to allow me to continue making selections.
show page pageNumber
-- jump to the right page or break obviously
activate
beep
end tell
end tell
set States of theScript to currentStates
store script theScript in (alias scriptPath) replacing yes
--------[/script]--------
I'll send a couple of sample files to anyone who asks, but they are way too
long to include here.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX [long] XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
My apologies for such a long post. I hope it is of interest to at least a
few readers.