OK, I'm not sure what I'm doing wrong, and it's too late for me to keep staring at it. My guess is that one statement is in the wrong place relative to one of the loops...
What I expect is a <tu> block for each row in the Excel data and a <tuv> block for each cell in that row, with <body> enclosing all of them.
But what I get is a single <tu> enclosing everything (instead of one per row), with <body> enclosing that.
The other problem I find hard to solve is how to use NSXMLDocument to properly create the whole thing.
I'm cheating by creating <tmx> as just another element when it should be the root.
Also, I can't find a way to create the <?xml...> and <!DOCTYPE ...> lines at the top of the document, which I guess is related to the previous point...
Here is what I came up with.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions
# Excel contains tabular data with the following structure:
# First row is a list of language names
# Columns have translations of the first column in the language given in the first row
# en fr ja
# Airport Aéroport 空港
# Expressway Autoroute 高速道路
# etc.
# convert translation memory data to TMX format
# first row of myTMData is used as attribute values for the column languages
# each following row contains a number of translations of the first column, in various languages
# structure:
# <?xml...>
# <!DOCTYPE...>
# <tmx>
#   <header>
#   </header>
#   <body>
#     <tu>
#       <tuv xml:lang="l1-L1"><seg>data1</seg></tuv>
#       <tuv xml:lang="l2-L2"><seg>data1</seg></tuv>
#       <tuv xml:lang="l3-L3"><seg>data1</seg></tuv>
#     </tu>
#     <tu>
#       <tuv xml:lang="l1-L1"><seg>data2</seg></tuv>
#       <tuv xml:lang="l2-L2"><seg>data2</seg></tuv>
#       <tuv xml:lang="l3-L3"><seg>data2</seg></tuv>
#     </tu>
#     <tu>
#       <tuv xml:lang="l1-L1"><seg>data3</seg></tuv>
#       <tuv xml:lang="l2-L2"><seg>data3</seg></tuv>
#       <tuv xml:lang="l3-L3"><seg>data3</seg></tuv>
#     </tu>
#     ...
#   </body>
# </tmx>
tell application "Microsoft Excel"
	tell first worksheet of active workbook
		set myTMData to the string value of used range
	end tell
end tell
# language codes to serve as <tuv>'s xml:lang attributes
set theLANGAttribute to list 1 of myTMData
# srclang attribute for the header
set myTMsrclang to item 1 of theLANGAttribute
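# creationdate attribute for the header; TMX expects an ISO 8601 date (YYYYMMDDThhmmssZ, in UTC)
set myDateFormatter to current application's NSDateFormatter's new()
myDateFormatter's setLocale:(current application's NSLocale's localeWithLocaleIdentifier:"en_US_POSIX")
myDateFormatter's setTimeZone:(current application's NSTimeZone's timeZoneForSecondsFromGMT:0)
myDateFormatter's setDateFormat:"yyyyMMdd'T'HHmmss'Z'"
set myDateString to (myDateFormatter's stringFromDate:(current application's NSDate's |date|())) as text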
# creation of the root element (<tmx>) and of its only attribute
set tmxRoot to current application's NSXMLNode's elementWithName:"tmx"
set tmxVersion to (current application's NSXMLNode's attributeWithName:"version" stringValue:"1.4")
(tmxRoot's addAttribute:tmxVersion)
# creation of the <header> element and of its many attributes
set tmxHeader to current application's NSXMLNode's elementWithName:"header"
set headerCreationTool to (current application's NSXMLNode's attributeWithName:"creationtool" stringValue:"xls2tmx")
set headerCreationToolVersion to (current application's NSXMLNode's attributeWithName:"creationtoolversion" stringValue:"0.1")
set headerSrcLang to (current application's NSXMLNode's attributeWithName:"srclang" stringValue:myTMsrclang)
set headerAdminLang to (current application's NSXMLNode's attributeWithName:"adminlang" stringValue:"en")
set headerDatatype to (current application's NSXMLNode's attributeWithName:"datatype" stringValue:"unknown")
set headerOtmf to (current application's NSXMLNode's attributeWithName:"o-tmf" stringValue:"Microsoft Excel")
set headerSegtype to (current application's NSXMLNode's attributeWithName:"segtype" stringValue:"paragraph")
set headerCreationDate to (current application's NSXMLNode's attributeWithName:"creationdate" stringValue:myDateString)
(tmxHeader's addAttribute:headerCreationTool)
(tmxHeader's addAttribute:headerCreationToolVersion)
(tmxHeader's addAttribute:headerSrcLang)
(tmxHeader's addAttribute:headerAdminLang)
(tmxHeader's addAttribute:headerDatatype)
(tmxHeader's addAttribute:headerOtmf)
(tmxHeader's addAttribute:headerSegtype)
(tmxHeader's addAttribute:headerCreationDate)
# creation of the <body> element, no attributes
set tmxBody to current application's NSXMLNode's elementWithName:"body"
# creation of the <tu> element, no attributes
set TUElement to (current application's NSXMLNode's elementWithName:"tu")
# processing the TM data, from the second row
repeat with i from 2 to length of myTMData
	# myTU is a given row, contains as many <tuv> elements as there are items in rows of myTMData
	set myTU to item i of myTMData
	repeat with j from 1 to length of myTU
		# myTUV is a given item in a myTU
		set myTUV to item j of myTU
		# myTUV's xml:lang attribute will be the item with the same index in theLANGAttribute
		set myLANGAttribute to item j of theLANGAttribute
		set newTuv to (current application's NSXMLNode's elementWithName:"tuv")
		set newAttribute to (current application's NSXMLNode's attributeWithName:"xml:lang" stringValue:myLANGAttribute)
		(newTuv's addAttribute:newAttribute)
		# newSeg is the segment for that <tuv>
		set newSeg to (current application's NSXMLNode's elementWithName:"seg" stringValue:myTUV)
		(newTuv's addChild:newSeg)
		# I'm guessing the issue comes from the following line's position in the script
		# But since wherever I put it I can't get the thing to wrap <tu> around each row, there must be some other problem...
		(TUElement's addChild:newTuv)
	end repeat
end repeat
(tmxBody's addChild:TUElement)
(tmxRoot's addChild:tmxHeader)
(tmxRoot's addChild:tmxBody)
return tmxRoot's XMLString() as text
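
For what it's worth, here is a minimal sketch of how both problems might be addressed; it is one possible arrangement, not necessarily the best one. It assumes the cause of the single <tu> is that only one <tu> element is ever created (before the loops) and that it is added to <body> only once (after the loops), so the sketch creates a fresh <tu> inside the outer loop and attaches it to <body> at the end of each row. It then wraps the root in an NSXMLDocument and attaches an NSXMLDTD so that the <?xml...> and <!DOCTYPE ...> lines are generated. The sample data replaces the Excel call so the sketch runs on its own, the header attributes are left out for brevity, and the DTD system ID "tmx14.dtd" is an assumption to adjust as needed.

```applescript
use AppleScript version "2.4" -- Yosemite (10.10) or later
use framework "Foundation"
use scripting additions

# hypothetical sample data standing in for Excel's "string value of used range"
set myTMData to {{"en", "fr", "ja"}, {"Airport", "Aéroport", "空港"}, {"Expressway", "Autoroute", "高速道路"}}
set theLANGAttribute to item 1 of myTMData

set tmxRoot to current application's NSXMLNode's elementWithName:"tmx"
(tmxRoot's addAttribute:(current application's NSXMLNode's attributeWithName:"version" stringValue:"1.4"))
# header attributes omitted here; they would be added as in the original script
set tmxHeader to current application's NSXMLNode's elementWithName:"header"
set tmxBody to current application's NSXMLNode's elementWithName:"body"

repeat with i from 2 to length of myTMData
	set myTU to item i of myTMData
	# a fresh <tu> for every row, created inside the outer loop
	set TUElement to (current application's NSXMLNode's elementWithName:"tu")
	repeat with j from 1 to length of myTU
		set newTuv to (current application's NSXMLNode's elementWithName:"tuv")
		(newTuv's addAttribute:(current application's NSXMLNode's attributeWithName:"xml:lang" stringValue:(item j of theLANGAttribute)))
		(newTuv's addChild:(current application's NSXMLNode's elementWithName:"seg" stringValue:(item j of myTU)))
		(TUElement's addChild:newTuv)
	end repeat
	# attach the finished row to <body> before starting the next one
	(tmxBody's addChild:TUElement)
end repeat

(tmxRoot's addChild:tmxHeader)
(tmxRoot's addChild:tmxBody)

# make <tmx> the root of a real document; this is what yields the <?xml ...?> line
set theDoc to current application's NSXMLDocument's alloc()'s initWithRootElement:tmxRoot
(theDoc's setVersion:"1.0")
(theDoc's setCharacterEncoding:"UTF-8")

# the DOCTYPE line comes from a DTD node attached to the document
# (the system ID below is an assumption, not taken from the original script)
set theDTD to current application's NSXMLDTD's alloc()'s init()
(theDTD's setName:"tmx")
(theDTD's setSystemID:"tmx14.dtd")
(theDoc's setDTD:theDTD)

return (theDoc's XMLStringWithOptions:(current application's NSXMLNodePrettyPrint)) as text
```

With the three-row sample above, the result should contain two <tu> blocks (one per data row), each holding three <tuv> elements. To use it with real data, swap the sample list for the Excel "tell" block from the script above.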