Mailing Lists: Apple Mailing Lists


Re: AppleSingle/AppleDouble & character encoding



On 10/16/05 at 10:11 PM, James W. Walker <email@hidden> wrote:

> Is there any documentation (or sample code) for AppleSingle/
> AppleDouble newer than RFC 1740?  The RFC does not mention character
> encoding, and it's kind of important to know the encoding of the file
> name.

The things I'm forced to remember that I wish I could forget.

There's nothing *newer* than RFC 1740 because the format was
established several years earlier, with version 1.0 in 1988 or so, and
version 2.0 (what everyone deals with today) in 1990.  The summary in
Appendix A of RFC 1740 is just that - a summary, not a spec.

Version 1.0 is probably not what you want to use.  I documented version
1.0 over 15 years ago in an Apple II File Type Note, available at
<http://www.nulib.com/library/FTN.e00001.htm> for AppleSingle and
<http://www.nulib.com/library/FTN.e000023.htm> for AppleDouble.  Note
that this was for Apple II developers, so big-endian structures are
described as "Reverse," because the Apple II was a little-endian
system.
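To make the byte-order point concrete, here is a minimal Python sketch that reads the first two header fields both ways. The magic numbers and the version value are the ones given in RFC 1740, Appendix A; the function name and the sample bytes are just made up for illustration.

```python
import struct

# Magic numbers as listed in RFC 1740, Appendix A.
APPLESINGLE_MAGIC = 0x00051600
APPLEDOUBLE_MAGIC = 0x00051607

def read_magic_and_version(data):
    """Return (magic, version) from the first 8 bytes, read big-endian."""
    # ">" selects big-endian; "<" would be the Apple II's native byte order.
    return struct.unpack_from(">II", data, 0)

# Hypothetical sample: an AppleSingle magic number followed by version 2.0.
sample = bytes.fromhex("00051600" "00020000")
magic, version = read_magic_and_version(sample)

# The same four bytes read little-endian give a different number, which
# is why the File Type Notes call the on-disk order "Reverse."
wrong_magic = struct.unpack_from("<I", sample, 0)[0]
```

Here `magic` compares equal to `APPLESINGLE_MAGIC`, while `wrong_magic` does not.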

I cannot come up with an explanation for why the Mac OS uses this
format while Apple itself has failed to document version 2.0.  The
original documentation was a printed document from APDA, and was later
made available on the Developer CDs in "AppleLink Image Format."

I have a version of that document that someone at Apple converted to
PDF in 1995, and since it is not for sale anywhere and seems to be
somewhat important, I've put it on our Web site until someone from
Apple tells us not to:

<http://www.macjournals.com/special/AppleSingle-Double.pdf>

My memory is that we didn't update the free File Type Note because
Apple II developers didn't need to read or write version 2.0 files -
the revision didn't buy them anything that version 1.0 didn't already
have, so we just didn't waste the time on it.  Otherwise, the info
would be available for free already.  I think the decision was that
we'd keep the File Type Note at 1.0 and the new document for 2.0, so
people could find both versions if they wanted to.  Funny how it didn't
work out that way.

---

As to the *specific* question, as you can see, the document in question
never once mentions character encoding.  I think you're talking about
entry 3, "Real Name," which is supposed to be the "file's name as
created on home file system."  On numbered page 3 (physical page 7),
the document explains:

> The home file system is the primary file system for which the file's
> contents were created. The home file system is not necessarily the
> file system in which the file was created. For example, if a program
> running on a UNIX(R) system creates a file that holds a MacWrite(R)
> document, the file's home file system is the Macintosh file
> system—not the UNIX file system—because the file's contents are
> formatted for a Macintosh application.

Because of this, and based on my memories of documenting the format, I would
say that the character encoding for such a filename is presumed to be
MacRoman.  The spec was written before storing script encodings with
file references was common, and version 2.0 of AppleSingle simply does
not store a text encoding.

Since the format was created for A/UX, which was almost always presumed to be
running in MacRoman, I have to conclude that most software reading
AppleSingle/Double files expects entry 3 to be MacRoman (or ASCII,
which is a MacRoman subset).  I don't remember any discussion about
alternate encodings.

The File Type Note says this about entry 3:

> The Real Name entry indicates the file's original filename in the
> host file system.  This is not a Pascal or C string; it is just ASCII
> data.  The length is indicated by the Entry Length field for the Real
> Name entry.

I don't think I was saying that the name was *limited* to ASCII, and
the version 2.0 note explicitly talks about non-ASCII characters, with
all the examples using MacRoman character encoding values.  However,
ASCII should always work.
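As a rough illustration of reading entry 3 under that presumption, here is a minimal Python sketch. It assumes the version 2.0 header layout summarized in RFC 1740 (4-byte magic, 4-byte version, 16 bytes of filler, 2-byte entry count, then 12-byte descriptors of entry ID, offset, and length, all big-endian). The MacRoman default is the presumption argued above, not something the file itself records, and `real_name` and the sample data are hypothetical. The sketch does not validate the magic number or the version.

```python
import struct

REAL_NAME_ID = 3  # entry ID for "Real Name"

def real_name(data, encoding="mac_roman"):
    """Return the Real Name entry decoded as text, or None if absent."""
    # Header: magic (4), version (4), filler (16), entry count (2).
    magic, version, count = struct.unpack_from(">II16xH", data, 0)
    for i in range(count):
        # Descriptors start at byte 26 and are 12 bytes each.
        entry_id, offset, length = struct.unpack_from(">III", data, 26 + 12 * i)
        if entry_id == REAL_NAME_ID:
            # Not a Pascal or C string: the length comes from the descriptor.
            return data[offset:offset + length].decode(encoding)
    return None

# Hypothetical one-entry file whose name contains two e-acute characters.
name_bytes = "R\u00e9sum\u00e9".encode("mac_roman")
sample = (struct.pack(">II16xH", 0x00051600, 0x00020000, 1)
          + struct.pack(">III", REAL_NAME_ID, 38, len(name_bytes))
          + name_bytes)
decoded = real_name(sample)
```

Passing a different `encoding` is the escape hatch if you know the file came from a system running in some other script.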

There are plenty of HFS Plus filenames that AppleDouble 2.0 can't
represent.  My understanding is that Apple made internal revisions to
the format for the system's use, but that these changes have never been
documented.  You might be able to check out the UFS file system code
from Darwin and see how it implements multiple forks and extra file
attributes, because there's probably some AppleDouble in there, but by
its nature, such code may change without notice.
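If you need to know ahead of time whether a given name will survive, one rough check is simply to try the conversion. This Python sketch assumes the Real Name entry is MacRoman, as presumed above, and composes the name first because HFS Plus stores names in decomposed Unicode. The function name is hypothetical, and it tests encodability only, not length or any other limit.

```python
import unicodedata

def fits_mac_roman(name):
    """Return True if the name can be written as MacRoman bytes."""
    # HFS Plus stores decomposed Unicode (e + combining acute), so
    # compose first; MacRoman only has the precomposed characters.
    composed = unicodedata.normalize("NFC", name)
    try:
        composed.encode("mac_roman")
        return True
    except UnicodeEncodeError:
        return False

ok = fits_mac_roman("Re\u0301sume\u0301")        # decomposed e-acute
bad = fits_mac_roman("\u65e5\u672c\u8a9e.txt")   # Japanese characters
```

Names that fail this check are the ones that need whatever Apple's undocumented revisions do, or a different container entirely.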

This would be an excellent thing to put in RADAR.

--Matt

--
Matt Deatherage                              <email@hidden>
GCSF, Incorporated                      <http://www.macjournals.com>

"Success is the ability to go from one failure to another with no loss of
 enthusiasm."  -- Sir Winston Churchill


 _______________________________________________
Carbon-dev mailing list      (email@hidden)



