Re: using FSevents for backup - is it reliable enough?
- Subject: Re: using FSevents for backup - is it reliable enough?
- From: James Bucanek <email@hidden>
- Date: Fri, 21 Jan 2011 08:58:47 -0700
Stan Sieler <mailto:email@hidden> wrote (Thursday, January 20,
2011 5:34 PM -0800):
A note of explanation about my verbosity...
Wow, I finally found people who understand development and file systems ... thanks!
For nearly 20 years, I was used to being able to post highly
technical questions (and answers for others) about HP's MPE/iX
operating system ... it's good to find a mailing list with
similar quality for the Mac! Thanks!
(That also means you'll see me making rookie mistakes sometimes, since I'm
new to the Mac as a developer :)
The advantage of "near" is that an app can walk up to a file and say "what
attributes do you have?"
Which is exactly what the CSBackupIsItemExcluded function does. Give it a
file and it will tell you what the file's backup exclusion
status is.
An API is nice to have, but it doesn't pertain to the philosophical discussion
of where the data should really be stored. (I.e., no matter how it's
implemented, the API can be written and have identical semantics.)
I agree, which was my point. ;)
It really pertains to the philosophical point that one should
not (generally) be concerned about the implementation details of
an API.
In a "near" implementation, the API's run-time cost is cheap. In a "far" one,
it's more expensive (at best, it has to search a file/tree/something, which
can often cost more CPU/elapsed than looking at a near structure).
Not necessarily true. Again, this is an implementation detail
that one can't make assumptions about. What if the property were
maintained by an in-memory hash table? The lookup cost could be
orders of magnitude faster than any I/O operation, no matter how
"close" to the target file system object.
Knowing what the attributes are means that there is some chance that they
can be carried over to a copy/backup of the file.
But since this is an OS API, applications don't have to worry about it. Whether
a copy retains the attribute is up to the OS and the kind of attribute. A
backup file would never have this attribute, since a file with this
attribute shouldn't be backed up.
That's a hard one. For some file attributes, the app dealing with the file
can make better judgements about whether or not to maintain, delete, or modify
an attribute. For others, only the OS can.
I think we may be talking at cross-purposes. I'm referring
specifically to the option in CSBackupSetItemExcluded that
controls whether the exclusion is "by item" or "by path". In the
case of the "by path" variant, the attribute is not associated
with the physical file system object; it's associated with its
path. So renaming the item would not preserve this property. If
set to "by item", then renaming the object would preserve the
property. My point being, this is the job of the OS to manage
and maintain, it's not the responsibility of every application
that renames an item to worry about.
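
To make the "by item" versus "by path" distinction concrete, here is a toy model in Python. It is only a sketch of the semantics described above, not how CSBackupSetItemExcluded is implemented; ToyVolume, its item IDs, and the paths are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVolume:
    """Hypothetical volume that maps paths to stable item IDs."""
    ids_by_path: dict = field(default_factory=dict)
    excluded_paths: set = field(default_factory=set)  # "by path" exclusions
    excluded_items: set = field(default_factory=set)  # "by item" exclusions
    _next_id: int = 1

    def create(self, path):
        self.ids_by_path[path] = self._next_id
        self._next_id += 1

    def set_excluded(self, path, by_path):
        if by_path:
            # The exclusion is recorded against the path string.
            self.excluded_paths.add(path)
        else:
            # The exclusion is recorded against the item itself.
            self.excluded_items.add(self.ids_by_path[path])

    def is_excluded(self, path):
        return (path in self.excluded_paths
                or self.ids_by_path[path] in self.excluded_items)

    def rename(self, old, new):
        # The item keeps its ID; only its path changes.
        self.ids_by_path[new] = self.ids_by_path.pop(old)


vol = ToyVolume()
vol.create("/a/cache")
vol.create("/a/scratch")
vol.set_excluded("/a/cache", by_path=True)
vol.set_excluded("/a/scratch", by_path=False)

vol.rename("/a/cache", "/a/cache2")
vol.rename("/a/scratch", "/a/scratch2")

by_path_survives = vol.is_excluded("/a/cache2")    # exclusion stayed with the old path
by_item_survives = vol.is_excluded("/a/scratch2")  # exclusion followed the item
```

Note that rename() contains no exclusion logic at all. The caller doing the rename never has to know the property exists.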
E.g., imagine an attribute which
is a boolean: have_I_been_backed_up. If I copy file FOO to file FUM, then I'd
probably want that attribute reset. On the other hand, if the file has a
boolean attribute: needs_virus_check, then copying the file shouldn't clear
the flag on the new file. Thus, two different boolean attributes with two
different "desired" actions ... what's a cp program to do? :)
I think you might be over-thinking the problem a bit. In
general, it's not the job of copy programs to interpret or worry
about the extraneous metadata associated with a file. A good
copy program should faithfully duplicate all extraneous metadata
(data forks, extended attributes, ACLs, etc.), unless
specifically instructed not to. An example would be to copy a
file but not retain its ownership.
The "I've been backed-up" and "needs virus check" attributes are
actually good examples. A cp program that copied these
attributes to a new file would be doing its job. If the source
needed virus scanning, then one would assume that the copy needs
virus scanning too.
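
As a rough illustration of a copy routine that duplicates metadata without interpreting it, here is a small Python sketch built on shutil.copy2 from the standard library. The faithful_copy wrapper and the file names are made up for the example. How much metadata actually survives depends on the platform: Python documents copying of extended attributes only for Linux, so this says nothing about what cp does on the Mac.

```python
import os
import shutil
import tempfile


def faithful_copy(src, dst):
    """Copy the data plus whatever metadata shutil can preserve here."""
    # shutil.copy2 copies the contents and then calls shutil.copystat,
    # which carries over permission bits and timestamps (and, on Linux,
    # extended attributes where possible). It never looks at what any
    # individual attribute means.
    return shutil.copy2(src, dst)


workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "FOO")
dst = os.path.join(workdir, "FUM")

with open(src, "w") as f:
    f.write("payload")
os.chmod(src, 0o640)
os.utime(src, (1_000_000_000, 1_000_000_000))  # fixed atime and mtime

faithful_copy(src, dst)

same_mode = os.stat(src).st_mode == os.stat(dst).st_mode
same_mtime = int(os.stat(src).st_mtime) == int(os.stat(dst).st_mtime)
```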
A backup program, however, might not want the copy of a file to
be flagged as having already been backed up. In this case,
attaching a "has been backed up" attribute is poor design on
the backup application's part. The solution would be to store
the information about which items have been backed up "far" away
from the file system object.
See what's happened? The burden of whether to store information
about the item "near" or "far" becomes a design decision based
on the desired behavior of those attributes. Applications that
want that information to follow the file simply attach it to the
file. Those that don't want that find some other means of tracking it.
It's up to CSBackupSetItemExcluded to decide where the best
place to keep that information is. It's up to copy programs to
duplicate all locally attached metadata about the file. And
everything just works.
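
Here is a minimal Python sketch of the "far" approach for a backup program, assuming a hypothetical BackupCatalog that remembers each path's size and modification time at the last backup. Nothing is attached to the file itself, so a faithful copy cannot inherit the "already backed up" state.

```python
import os
import shutil
import tempfile


class BackupCatalog:
    """Hypothetical 'far' record of what has been backed up, keyed by path."""

    def __init__(self):
        self._seen = {}  # path -> (size, mtime_ns) at the last backup

    @staticmethod
    def _signature(path):
        st = os.stat(path)
        return (st.st_size, st.st_mtime_ns)

    def needs_backup(self, path):
        # Unknown paths and changed files both need a backup.
        return self._seen.get(path) != self._signature(path)

    def mark_backed_up(self, path):
        self._seen[path] = self._signature(path)


workdir = tempfile.mkdtemp()
foo = os.path.join(workdir, "FOO")
fum = os.path.join(workdir, "FUM")

with open(foo, "w") as f:
    f.write("payload")

catalog = BackupCatalog()
catalog.mark_backed_up(foo)

# Copy FOO to FUM, preserving metadata. The catalog entry does not follow.
shutil.copy2(foo, fum)

foo_pending = catalog.needs_backup(foo)  # already recorded, unchanged
fum_pending = catalog.needs_backup(fum)  # never seen by the catalog
```

A real backup tool would likely key on something sturdier than the path and would persist the catalog. The only point here is that the copy step needs no special handling.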
The application doesn't have to know anything, except to call CSBackupIsItemExcluded.
For backups, yes. But I was discussing the more global question of how to
store / query file attributes in general.
I wasn't, which is probably the source of some of the disagreement.
And, in the case of a "cp" program, if cp called that API and checked the
flag, what does it do with the information?
Nothing.
Call CSBackupSetItemExcluded on
the new file? That may not be appropriate in some cases.
No. It would duplicate all of the metadata attached to the file
in a generic fashion. Whether that information includes an
"exclude from backup" bit is something CSBackupSetItemExcluded()
has to worry about, not the cp program.
Not always ... I've already shown one example (has file been virus-checked)
where the use of an attribute is at least partially the responsibility of some
program outside the traditional definition of "the OS".
Yes, it's the responsibility of the virus checker program, but
certainly not backup or copy applications.
I think you're imagining a world where copy and backup
applications need to know about the purpose and meaning of
individual pieces of metadata, but that's an untenable solution.
It's the kind of design that results in DLL hell. The
responsibility for maintaining and interpreting individual
metadata records should begin, and end, with the application
that uses it, with the understanding that applications like cp
will faithfully copy them.
--
James Bucanek