Re: Metadata support
- Subject: Re: Metadata support
- From: Dan Shoop <email@hidden>
- Date: Mon, 26 Jun 2006 19:09:17 -0400
At 11:46 PM -0700 6/25/06, Jordan K. Hubbard wrote:
On Jun 25, 2006, at 7:29 PM, Dan Shoop wrote:
At 6:02 PM -0700 6/25/06, Jordan K. Hubbard wrote:
I think there's been a lot of confusion here about what
constitutes "file metadata".
Can we be clear that "creation date" is metadata? It's not file
data, but it's important to the filesystem (or should be), the
user, and applications that make use of it. It walks like a duck.
OK, creation date is admittedly special. Let's call it "filesystem
internal" metadata for the purpose of tracking when a file was
created. I mistakenly lumped it into my list of stat(2) data due
to a faulty memory of the st_ctimespec field in the stat structure,
but that doesn't change the facts. I believe the creation date is
not something you're intended to spoof - it's when the file was
created.
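To make the st_ctimespec mix-up above concrete, here is a minimal Python sketch, not anything from the thread itself. It assumes os.lstat() exposes st_birthtime where the platform records a creation time (as on Mac OS X and the BSDs) and simply reports None elsewhere; the helper name file_times is hypothetical.

```python
import os

def file_times(path):
    """Return the timestamps stat exposes for path, without following symlinks."""
    st = os.lstat(path)
    return {
        "modified": st.st_mtime,
        # On Unix this is the inode status-change time, NOT the creation time.
        "status_changed": st.st_ctime,
        # Creation ("birth") time; None on platforms that do not report it.
        "created": getattr(st, "st_birthtime", None),
    }
```

Calling file_times() on the same file before and after a chmod should show "status_changed" moving while "created", where present, stays put.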
Correct me if I'm wrong, but this was traditionally included in an
AppleDouble. However, OS X doesn't read/write as much data in the
AppleDouble for its Finder Info.
If you create a backup file, that backup file will have its own creation date.
Well, that's wrong then. A backup /must be/ identical to the source
in every possible respect, otherwise it cannot be used to properly
restore said file to its original state. While one might argue that
cp or ditto aren't backup tools, tar clearly is.
If you restore from backup in such a way that the original file is
deleted and replaced (e.g. a new inode is allocated) then, by all
rights, the new file is not the same as the old file and should have
its (newer) creation date reflect this. FWIW, this behavior is not
unique to MacOSX.
Nor is the expected behavior unique to Mac OS X; other OSen share it.
Except that things like creation date aren't POSIX metadata. POSIX
tools, unless made aware of these through the likes of copyfile(),
aren't going to deal with creation dates. Hence cp, when copying a
file, fails to maintain this metadatum from the source to the
target.
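As a rough way to see this for yourself, the sketch below copies a file with Python's shutil.copy2 (standing in for a POSIX-style copy that preserves modification time) and reports which timestamps carried over. It is an illustration under assumptions, not a statement about cp itself: the helper name copy_and_compare is hypothetical, and whether the creation time survives depends on the platform and filesystem, so the code only reports what it finds.

```python
import os
import shutil

def copy_and_compare(src, dst):
    """Copy src to dst and report which timestamps match afterwards."""
    # copy2 copies the data, then permission bits and access/modification times.
    shutil.copy2(src, dst)
    s, d = os.stat(src), os.stat(dst)
    report = {"modified": abs(s.st_mtime - d.st_mtime) < 1e-3}
    if hasattr(s, "st_birthtime"):
        # Creation time is only reported on some platforms (e.g. Mac OS X, BSD).
        report["created"] = abs(s.st_birthtime - d.st_birthtime) < 1e-3
    else:
        report["created"] = None  # platform does not expose a creation time
    return report
```

A result such as {"modified": True, "created": False} would be the case described above: the POSIX metadata came across, the creation date did not.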
cp copies all the relevant metadata, why don't we rephrase it that way? :)
So we'll accept that, perhaps, cp's function in life is that it makes
whole new files, OK. Fine. Philosophy.
Odd, then, that the Finder doesn't also follow this. If I copy a file using
the Finder it *maintains* the source file's creation date.
The expectation is that a Finder copy operation and a [BSD] cp copy
operation should be identical.
Moreover, however you might want to argue the above, backups and clones
aren't about being able to create new files but about being able to
re-create files *exactly* as they originally were. To condemn such
backups to always being physical rather than logical is inane.
Foreign file systems aren't going to store them either. Which is
why AppleDouble is supposed to store this in ._file. But it
appears that OS X doesn't.
This statement suggests some confusion about the purpose of the
AppleDouble file.
Is there some document supplanting the canonical
"AppleSingle/AppleDouble Formats for Foreign Files Developer's Note"?
To wit on page seven it clearly implies that we store creation dates:
"The File Dates Info entry (ID=8) consists of the file creation,
modification, backup
and access times".
Yet, IIRC, OS X seems to omit the creation date from its ._files.
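One way to check a given ._file is to parse it and look for entry ID 8. The Python sketch below assumes the header layout as I recall it from the developer's note: a 4-byte magic number (0x00051607), a 4-byte version, 16 filler bytes, a 2-byte entry count, then 12-byte descriptors (ID, offset, length), all big-endian, with the four dates stored as signed 32-bit seconds relative to January 1, 2000 GMT. Treat those details as assumptions to verify against the note; the function name read_file_dates is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

APPLEDOUBLE_MAGIC = 0x00051607   # assumed magic number for AppleDouble headers
FILE_DATES_ENTRY_ID = 8          # "File Dates Info" entry, per the quoted note
EPOCH_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)

def read_file_dates(blob):
    """Return the File Dates Info entry of an AppleDouble blob, or None if absent."""
    magic, _version = struct.unpack_from(">II", blob, 0)
    if magic != APPLEDOUBLE_MAGIC:
        raise ValueError("not an AppleDouble header")
    # 4 (magic) + 4 (version) + 16 (filler) = 24 bytes before the entry count.
    (count,) = struct.unpack_from(">H", blob, 24)
    for i in range(count):
        entry_id, offset, length = struct.unpack_from(">III", blob, 26 + 12 * i)
        if entry_id == FILE_DATES_ENTRY_ID and length >= 16:
            seconds = struct.unpack_from(">iiii", blob, offset)
            names = ("creation", "modification", "backup", "access")
            return {n: EPOCH_2000 + timedelta(seconds=s)
                    for n, s in zip(names, seconds)}
    return None
```

Reading the bytes of a ._file and passing them to read_file_dates() returns None when no File Dates Info entry is present, which is the omission described above.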
Now you could define the philosophy that if you're copying a file
you're creating a new copy so that new copy should have its own
creation date, but this flies against being able to back up a file
by copying it.
That is, indeed, how the philosophy is defined. :)
Then the Finder proves this philosophy is not followed.
If you want to back up files by copying them, you simply need to get
used to the notion that this creates new files. I enjoy backing up
files by copying them as much as the next guy (some people collect
bottlecaps, I copy files) and this doesn't bother me at all.
Hyperbole?
And then there's the ownership you bring up. What about it and
symlinks? This doesn't appear to suffer from any similar
philosophic quandary. Perhaps I'm missing it.
What about the symlinks? They're files too, of a sort (they used to
have far less metadata of their own in HFS, which I considered a
bug, but that's been getting better over time), and you can't tweak
their creation time either, so I can't see why they change the
discussion unless I, too, am missing something.
Talk ownership.
But the Finder Comments (which is what Spotlight Comments seem
descended from) were considered so before in classical Mac OSen, no?
I can't speak for the classical Mac OSen, but I can say that MacOSX
is descended from UNIX and hence is a child of different parents, so
to speak, and the fundamentals are going to reflect that.
Well, Mac users see themselves as Mac users, not unix users. If they
wanted FreeBSD they'd just install that instead.
Another instance of that would be spotlight's index data for the
file - would you have that associated with the file too?
I thought I answered that as unnecessary since it would just be re-created.
It was simply an example. One could contrive any number of
per-application scenarios where an application wanted to track
information about a file in a side-database and wouldn't necessarily
re-create that database on demand.
But we're talking pure files here, not what metadata some additional
application may decide to keep.
Let's stick to files and the filesystem and the tools and system
routines that manipulate them directly.
Copy a file. Make a new target file given an existing source file.
Maintain all its data and appropriate metadata so that to its
users it appears the same. If, as a user, I rely on something as
simple as knowing when I created some file, say a photo I took, so
that I can sort them, then having this information mangled when I
copy it is annoying and unexpected.
Likewise there should be SOME tool that can clone files or at least
volumes at a logical level (other than a physical level like dd.)
I would argue that tar already does a pretty good job of this.
No, it does not. It does a much poorer job than cp, even. I'd argue tar
*should* do a pretty good job at this but it doesn't even produce a
valid ._file.
You and I may disagree about the creation date semantics,
No, we (you and I) don't disagree; the Finder disagrees with you.
That's pretty basic.
but tar will back up your files, and their POSIX metadata, and the
EA metadata (along with the ACLs), and even those pesky .DS_Store
files if you're backing up the whole volume (or, at least, whole
directory trees), so no, that's not an unreasonable thing to ask for
and yes, we provided it.
Backups *must* be able to restore a file *exactly*. That implies if I
tar a volume, erase it, and restore it that the files should be
exactly the same. This does not happen.
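A claim like this can be tested rather than argued. The Python sketch below is one hypothetical way to do it: snapshot the metadata of a tree before the backup, snapshot it again after the restore, and list the fields that differ. The helper names snapshot and differences are made up for illustration, the field list is an assumption about what matters here, and fields the platform does not report (st_birthtime and st_flags are not available everywhere) are recorded as None.

```python
import os

# Metadata fields to compare; missing ones are recorded as None.
FIELDS = ("st_mode", "st_uid", "st_gid", "st_mtime", "st_birthtime", "st_flags")

def snapshot(root):
    """Map each path under root (relative to root) to its lstat metadata."""
    snap = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat, so symlinks are recorded as themselves
            snap[os.path.relpath(path, root)] = {
                f: getattr(st, f, None) for f in FIELDS
            }
    return snap

def differences(before, after):
    """Return {relative path: [fields that differ]} between two snapshots."""
    report = {}
    for rel in sorted(before.keys() | after.keys()):
        a, b = before.get(rel), after.get(rel)
        if a is None or b is None:
            report[rel] = ["missing"]
            continue
        changed = [f for f in FIELDS if a[f] != b[f]]
        if changed:
            report[rel] = changed
    return report
```

An empty result from differences(snapshot(original), snapshot(restored)) would mean the restore was exact for these fields; any entry listing st_birthtime is the complaint made here. Extended attributes, ACLs and resource forks are outside what this sketch inspects.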
We even gave you rsync -E, for those cross-machine scenarios.
Let's not even go there, as rsync -E is a huge bag of worms.
I haven't seen anything in this discussion to suggest that those
mechanisms are inadequate for restoring data in a useful form,
simply some argument about the importance of creation time.
Should we enumerate a list for rsync?
- creation date
- symlink ownership
- modification date
- chflag metadata
- ACLs AND resource forks !!!
- Aliases
Yes, here's an example where almost anything does a better job than rsync.
--
-dhan
------------------------------------------------------------------------
Dan Shoop AIM: iWiring
Systems & Networks Architect http://www.ustsvs.com/
email@hidden http://www.iwiring.net/
1-714-363-1174
pgp key fingerprint: FAC0 9434 B5A5 24A8 D0AF 12B1 7840 3BE7 3736 DE0B
iWiring provides systems and networks support for Mac OS X, unix, and
Open Source application technologies at affordable rates.