For the moment I went with the mtrans() function. If I am following
correctly, the matrix transpose section on the developer site
describes what mtrans() is doing (and is essentially what Holger had
suggested as well).
Ian, I'm not certain I follow the statement you made (pasted
below). Could you, or someone else, elaborate a bit?
The thing you want to avoid is striding through data at multiples of
64 kB. If you are striding vertically (or horizontally in FORTRAN)
through an array and each row (or column in FORTRAN) is a multiple of
64 kB wide (tall) then you will be striding through data at multiples
of 64 kB. To fix this, the number of bytes in each row (or column) of
data should not be a multiple of 65536 bytes. If it is, allocate your
matrix to be a bit wider (taller) and do not use the extra columns
(rows).
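
My reading of the quoted advice, as a rough C sketch: if a row's size in
bytes is a multiple of 65536, allocate each row a little wider and index
with the padded row size instead of the logical width. The helper names
(padded_pitch, alloc_padded, elem) and the 128-byte pad are my own
assumptions for illustration. They are not taken from the developer site
or from mtrans(), and the best pad size will depend on the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CONFLICT_STRIDE 65536u /* 64 kB: the stride the quoted advice says to avoid */
#define PAD_BYTES 128u         /* assumed pad size; must not itself be a multiple of 64 kB */

/* Hypothetical helper: bytes from the start of one row to the start of the next. */
size_t padded_pitch(size_t row_bytes)
{
    size_t pitch = row_bytes;
    if (pitch % CONFLICT_STRIDE == 0)
        pitch += PAD_BYTES;               /* make each row a bit wider */
    assert(pitch % CONFLICT_STRIDE != 0); /* row stride is no longer a 64 kB multiple */
    return pitch;
}

/* Allocate rows * pitch bytes; the padding at the end of each row is never used.
   Returns NULL if malloc fails. The chosen pitch is written to *pitch_out. */
float *alloc_padded(size_t rows, size_t cols, size_t *pitch_out)
{
    *pitch_out = padded_pitch(cols * sizeof(float));
    return malloc(rows * *pitch_out);
}

/* Address of element (r, c), stepping between rows by the padded pitch. */
float *elem(float *base, size_t pitch, size_t r, size_t c)
{
    return (float *)((char *)base + r * pitch) + c;
}
```

With this layout, walking down a column (fixed c, increasing r) advances
by pitch bytes per step rather than by the raw row size, so the stride is
never a multiple of 64 kB even when the logical row is exactly 64 kB
wide. The caller frees the matrix with free() as usual.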