So, I need to convert an NSString value of something like
"http://www.apple.com", to a straight c-string and write that out. I've
tried a lot of different ways, but I keep ending up with lots of
binary 0's. I think there is something fundamental about
bytes/characters/etc. that I don't get.
Well, the first thing you need to find out is what character encoding
should be used for that string. It should say somewhere in the spec
what you should be using (or there will be a field somewhere that
specifies the character encoding explicitly).
At the risk of doing rather too much for you (and because questions
about bytes and characters keep popping up so perhaps others will
find this post and find what they need here)...
I think the format you quote comes from the 3GPP "Timed text format"
specification, which starts by explaining the encoding in section 5
(specifically, they say that you should use UTF-8 unless you start
with a UTF-16 BOM, in which case you can use UTF-16; also, they say
that text should be fully composed).
It also (section 5.17.1.5) says specifically that the URL is in UTF-8.
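
To make the BOM rule concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The enum and the function name detect_text_encoding are made up for illustration; they are not from the spec or from any framework. I'm assuming the usual UTF-16 BOM byte patterns (FE FF for big-endian, FF FE for little-endian); whether the spec accepts both byte orders is something to check in the spec itself. As an aside, ASCII characters in UTF-16 each carry a zero byte, which is one plausible source of the binary 0's you are seeing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a text sample by its leading bytes. */
typedef enum { TEXT_UTF8, TEXT_UTF16_BE, TEXT_UTF16_LE } text_encoding;

/* Returns a UTF-16 variant only if the data starts with a UTF-16 BOM;
   anything else (including empty data) is treated as UTF-8. */
static text_encoding detect_text_encoding(const uint8_t *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    if (len >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) return TEXT_UTF16_BE;
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) return TEXT_UTF16_LE;
    }
    return TEXT_UTF8;
}
```
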
Assuming that you are building this structure up in an NSMutableData
called "boxData", and that you have already encoded startCharOffset
and endCharOffset, you might do something like:
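  NSData *utf8Data = [[string precomposedStringWithCanonicalMapping]
                      dataUsingEncoding:NSUTF8StringEncoding];
  NSUInteger len = [utf8Data length];
  NSAssert (len < 256, @"Chapter URL is too long (you might want to "
                       @"handle this differently)");
  // One length byte, followed by that many bytes of UTF-8 data.
  uint8_t lenByte = (uint8_t)len;
  [boxData appendBytes:&lenByte length:1];
  [boxData appendData:utf8Data];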
(You could equally use UTF8String and strlen(), though I'd prefer the
-dataUsingEncoding: method for this particular application I think.)
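
For the UTF8String and strlen() route, a rough sketch of the same <length, string> layout in plain C might look like the following. The function name write_length_prefixed and its "return 0 on failure" convention are my own invention for illustration, and I'm assuming the input is already UTF-8 (for example, the result of -UTF8String). Note that the terminating NUL is not written; only the length byte and the string bytes are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes one length byte followed by the bytes of
   `utf8` (without the terminating NUL) into `out`. Returns the number of
   bytes written, or 0 if the string is longer than 255 bytes or `out`
   (capacity `cap`) is too small. */
static size_t write_length_prefixed(uint8_t *out, size_t cap, const char *utf8)
{
    assert(out != NULL && utf8 != NULL);
    size_t len = strlen(utf8);          /* byte count, not character count */
    if (len > 255 || cap < len + 1) return 0;
    out[0] = (uint8_t)len;              /* the length byte */
    memcpy(out + 1, utf8, len);         /* the string data */
    return len + 1;
}
```

Because even an empty string produces one byte (the zero length), a return value of 0 can only mean failure.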
I'll also note that it's likely that you'll keep wanting to encode
strings if you're working with a spec like this, and that they will
often be in the same format (i.e. a length byte, followed by that
many bytes of data). You might consider adding a category on
NSMutableData, containing the following method (or something similar):
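- (void)append3GPPString:(NSString *)string
{
  NSData *utf8Data = [[string precomposedStringWithCanonicalMapping]
                      dataUsingEncoding:NSUTF8StringEncoding];
  NSUInteger len = [utf8Data length];
  NSAssert (len < 256, @"String is too long for a one-byte length");
  uint8_t lenByte = (uint8_t)len;  // length byte, then the UTF-8 data
  [self appendBytes:&lenByte length:1];
  [self appendData:utf8Data];
}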
Then you can just call that method every time you want to encode a
<length, string> pair like that.
Hopefully you should be able to compare that with what you have now
and see where you were going wrong. (You might also consider
improving the error handling in the above code; you probably don't
want it to just assert if the string is too long to fit... maybe
truncating it somehow [careful; it's UTF-8 encoded
<http://en.wikipedia.org/wiki/UTF-8>] or reporting an error to the user
is what you want.)
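
If you do go the truncation route, the thing to be careful about is not cutting a multi-byte UTF-8 sequence in half. Here is a minimal sketch in plain C of one way to pick a safe cut point; the name utf8_truncated_length is hypothetical, and I'm assuming the input is valid UTF-8. It relies on the fact that continuation bytes in UTF-8 always have the bit pattern 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given `len` bytes of valid UTF-8, returns the largest
   length <= max_len that does not end in the middle of a multi-byte
   sequence. */
static size_t utf8_truncated_length(const char *utf8, size_t len, size_t max_len)
{
    assert(utf8 != NULL);
    if (len <= max_len) return len;     /* already fits, nothing to cut */
    size_t cut = max_len;
    /* If the byte just past the cut is a continuation byte (10xxxxxx), the
       cut falls inside a character, so back up to that character's start. */
    while (cut > 0 && ((unsigned char)utf8[cut] & 0xC0) == 0x80) {
        cut--;
    }
    return cut;
}
```

This only avoids splitting a code point. It can still separate a base character from a following combining mark, which matters less here if the text is fully composed as the spec asks.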