Alas, when pasting into TextEdit I discovered that the code only
grabbed the first two bytes, 01D1, giving me an infamous
LATIN CAPITAL LETTER O WITH CARON (whose code is 01D1)
when I wanted a
MUSICAL SYMBOL DOUBLE FLAT (01D12B).
I'm pretty sure the native encoding used by OS X is UTF-16, which
means that in order to deal with code points above U+FFFF, you have to
use surrogate pairs. Basically, you generate two separate 16-bit code
units representing a single scalar value; each of the code units is
essentially a single "digit" in base 1024, meaning you can represent
1024x1024=1,048,576 characters that way. Add the 65,536 code points in
the basic multilingual plane and you get the full Unicode code space
of 1,114,112 code points.
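
As a quick sanity check on that arithmetic, here is a small Python
sketch (the variable names are mine, purely for illustration). It only
assumes that each surrogate carries one base-1024 "digit", as described
above.

```python
# Each surrogate contributes one "digit" in base 1024 (10 bits).
SURROGATE_DIGITS = 1024

# Code points reachable through a surrogate pair.
supplementary = SURROGATE_DIGITS * SURROGATE_DIGITS

# Code points in the basic multilingual plane (U+0000..U+FFFF).
bmp = 0x10000

total = supplementary + bmp

print(supplementary)   # 1048576
print(total)           # 1114112
print(hex(total - 1))  # 0x10ffff, the highest Unicode code point
```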
The UTF-16 representation of U+1D12B consists of U+D834 followed by
U+DD2B. Here's how you get that from the scalar value (1D12B hex =
119,083 decimal).
1. Subtract 10000 hex = 65,536 decimal; this is the scalar value
represented by a surrogate pair whose two "digits" are both zero
(D800 followed by DC00). This is easy in the hexadecimal version -
just drop the leading 1. The result is D12B.
2. To convert to "base 1024", just divide and keep the quotient and
remainder. D12B hex = 53547 decimal. Divide by 1024 and you get 52
with a remainder of 299. So the two "base 1024 digits" we want to
output are 52 and 299 decimal; hex will be easier to work with, so
that's 34 and 12B.
3. The first digit comes from the high surrogates area, which starts
at D800. Just add: D800 + 34 = D834. That's the first "character"
output.
4. The second digit comes from the low surrogates area, which starts
at DC00. Just add: DC00 + 12B = DD2B. That's the second "character"
output, so the pair is D834 DD2B.
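
The four steps above can be sketched in a few lines of Python. This is
only an illustration of the arithmetic, not what TextEdit or OS X does
internally; the function name to_surrogate_pair is made up for this
example. The last line cross-checks the result against Python's own
"utf-16-be" codec.

```python
def to_surrogate_pair(scalar):
    """Split a code point above U+FFFF into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("surrogate pairs only encode U+10000..U+10FFFF")
    # Step 1: subtract 10000 hex.
    offset = scalar - 0x10000
    # Step 2: split into two base-1024 "digits" (quotient and remainder).
    high_digit, low_digit = divmod(offset, 1024)
    # Steps 3 and 4: add the digits to the start of each surrogate area.
    return 0xD800 + high_digit, 0xDC00 + low_digit


high, low = to_surrogate_pair(0x1D12B)
print(f"{high:04X} {low:04X}")  # D834 DD2B

# Cross-check: Python's big-endian UTF-16 encoder yields the same two units.
assert chr(0x1D12B).encode("utf-16-be") == bytes([0xD8, 0x34, 0xDD, 0x2B])
```

Passing a code point at or below U+FFFF raises ValueError here, since
those are written as a single 16-bit unit and need no surrogate pair.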