Re: using FSevents for backup - is it reliable enough?


Subject: Re: using FSevents for backup - is it reliable enough?
From: James Bucanek <email@hidden>
Date: Fri, 21 Jan 2011 08:58:47 -0700

Stan Sieler <mailto:email@hidden> wrote (Thursday, January 20, 2011 5:34 PM -0800):

A note of explanation about my verbosity... wow, I finally found people who understand development and file systems ... thanks! For nearly 20 years, I was used to being able to post highly technical questions (and answers for others) about HP's MPE/iX operating system ... it's good to find a mailing list with similar quality for the Mac! Thanks! (That also means you'll see me making rookie mistakes sometimes, since I'm new to the Mac as a developer :)

The advantage of "near" is that an app can walk up to a file and say "what
attributes do you have?"

Which is exactly what the CSBackupIsItemExcluded function does. Give it a
file and it will tell you what the file's backup exclusion
status is.

An API is nice to have, but it doesn't pertain to the philosophical discussion
of where the data should really be stored. (I.e., no matter how it's
implemented, the API can be written and have identical semantics.)


I agree, which was my point. ;)

It really pertains to the philosophical point that one should not (generally) be concerned about the implementation details of an API.

In a "near" implementation, the API's run-time cost is cheap.  In a "far" one,
it's more expensive (at best, it has to search a file/tree/something, which
can often cost more CPU/elapsed than looking at a near structure).

Not necessarily true. Again, this is an implementation detail that one can't make assumptions about. What if the property were maintained in an in-memory hash table? The lookup could be orders of magnitude faster than any I/O operation, no matter how "close" the data is to the target file system object.
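To make the hash-table point concrete, here is a minimal Python sketch of a "far" store. It is purely illustrative: the ExclusionIndex class and its method names are hypothetical, and nothing here is meant to describe how CSBackupIsItemExcluded is actually implemented.

```python
import os.path


class ExclusionIndex:
    """Hypothetical "far" store: exclusion flags live in an in-memory
    hash set keyed by normalized path, not on the file itself."""

    def __init__(self):
        self._excluded = set()

    @staticmethod
    def _key(path):
        # normpath is a pure string operation; it never touches the disk.
        return os.path.normpath(path)

    def set_excluded(self, path, excluded):
        if excluded:
            self._excluded.add(self._key(path))
        else:
            self._excluded.discard(self._key(path))

    def is_excluded(self, path):
        # Average O(1) set lookup, with no I/O against the file system.
        return self._key(path) in self._excluded
```

A caller sees the same semantics whether the flag is stored like this or as an attribute on the file, which is the reason the API's cost can't be inferred from how "near" the data is.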

Knowing what the attributes are means that there is some chance that they
can be carried over to a copy/backup of the file.


But since this is an OS API, applications don't have to worry about it. Whether
a copy retains the attribute is up to the OS and the kind of attribute. A
backup file would never have this attribute, since a file with this
attribute shouldn't be backed up.


That's a hard one.  For some file attributes, the app dealing with the file
can make better judgements about whether or not to maintain, delete, or modify
an attribute.  For others, only the OS can.

I think we may be talking at cross-purposes. I'm referring specifically to the option in CSBackupSetItemExcluded that controls whether the exclusion is "by item" or "by path". In the "by path" variant, the attribute is not associated with the physical file system object; it's associated with its path, so renaming the item would not preserve the property. If set to "by item", renaming the object would preserve the property. My point is that this is the job of the OS to manage and maintain; it's not something every application that renames an item should have to worry about.
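The difference between the two variants can be modeled with a small Python sketch. This is a toy model under stated assumptions, not Apple's implementation: the ExclusionStore class and the rename_demo helper are hypothetical, "by path" is modeled as a set of path strings, and "by item" is modeled as a set of (device, inode) pairs.

```python
import os
import tempfile


class ExclusionStore:
    """Toy model of "by path" vs "by item" exclusion (hypothetical)."""

    def __init__(self):
        self._by_path = set()   # keyed by absolute path string
        self._by_item = set()   # keyed by (device, inode) identity

    @staticmethod
    def _identity(path):
        st = os.stat(path)
        return (st.st_dev, st.st_ino)

    def set_excluded(self, path, exclude, by_path):
        if by_path:
            key, bucket = os.path.abspath(path), self._by_path
        else:
            key, bucket = self._identity(path), self._by_item
        if exclude:
            bucket.add(key)
        else:
            bucket.discard(key)

    def is_excluded(self, path):
        if os.path.abspath(path) in self._by_path:
            return True
        return os.path.exists(path) and self._identity(path) in self._by_item


def rename_demo():
    """Exclude one file by path and one by item, rename both, and
    report whether each exclusion survived the rename."""
    with tempfile.TemporaryDirectory() as d:
        store = ExclusionStore()
        a, b = os.path.join(d, "a.txt"), os.path.join(d, "b.txt")
        for p in (a, b):
            open(p, "w").close()
        store.set_excluded(a, True, by_path=True)
        store.set_excluded(b, True, by_path=False)
        a2, b2 = os.path.join(d, "a2.txt"), os.path.join(d, "b2.txt")
        os.rename(a, a2)   # the path changes, the file identity does not
        os.rename(b, b2)
        return store.is_excluded(a2), store.is_excluded(b2)
```

In this model the renaming code never consults the store; whether the property follows the file depends only on how it was recorded.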

E.g., imagine an attribute which
is a boolean: have_I_been_backed_up.  If I copy file FOO to file FUM, then I'd
probably want that attribute reset. On the other hand, if the file has a
boolean attribute: needs_virus_check, then copying the file shouldn't clear
the flag on the new file.  Thus, two different boolean attributes with two
different "desired" actions ... what's a cp program to do? :)

I think you might be over-thinking the problem a bit. In general, it's not the job of copy programs to interpret or worry about the extraneous metadata associated with a file. A good copy program should faithfully duplicate all extraneous metadata (data forks, extended attributes, ACLs, etc.), unless specifically instructed not to. An example would be to copy a file but not retain its ownership.

The "I've been backed up" and "needs virus check" attributes are actually good examples. A cp program that copied these attributes to a new file would be doing its job. If the source needed virus scanning, then one would assume that the copy needs virus scanning too.

A backup program, however, might not want the copy of a file to be flagged as having already been backed up. In this case, attaching a "has been backed up" attribute to the file is poor design on the backup application's part. The solution would be to store the information about which items have been backed up "far" away from the file system object.

See what's happened? The question of whether to store information about the item "near" or "far" becomes a design decision based on the desired behavior of those attributes. Applications that want the information to follow the file simply attach it to the file. Applications that don't find some other means of tracking it.

It's up to CSBackupSetItemExcluded to decide where the best place to keep that information is. It's up to copy programs to duplicate all locally attached metadata about the file. And everything just works.
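As a rough illustration of that division of labor, here is a minimal Python model. Everything in it is hypothetical (FileObject, cp, and BackupCatalog are invented names, and a dictionary stands in for the file system). The copy routine duplicates all attached metadata without interpreting it, while the backup tool keeps its "has been backed up" record in its own catalog.

```python
from dataclasses import dataclass, field


@dataclass
class FileObject:
    """Stand-in for a file: contents plus "near" metadata attached to it."""
    data: bytes
    attrs: dict = field(default_factory=dict)


def cp(src):
    """Generic copy: duplicates every attribute without knowing its meaning."""
    return FileObject(data=src.data, attrs=dict(src.attrs))


class BackupCatalog:
    """"Far" store owned by the backup tool, keyed by file name."""

    def __init__(self):
        self._backed_up = set()

    def mark_backed_up(self, name):
        self._backed_up.add(name)

    def is_backed_up(self, name):
        return name in self._backed_up


# FOO needs a virus check (stored near) and has been backed up (stored far).
files = {"FOO": FileObject(b"payload", {"needs_virus_check": True})}
catalog = BackupCatalog()
catalog.mark_backed_up("FOO")

# Copy FOO to FUM; cp never looks at what the attributes mean.
files["FUM"] = cp(files["FOO"])
```

After the copy, FUM still carries needs_virus_check because it was attached to the file, while the catalog has no entry for FUM, so the backup tool treats it as not yet backed up. Neither outcome required cp to understand either attribute.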

The application doesn't have to know anything, except to call CSBackupIsItemExcluded.

For backups, yes.  But I was discussing the more global question of how to
store / query file attributes in general.


I wasn't, which is probably the source of some of the disagreement.

And, in the case of a "cp" program, if cp called that API and checked the
flag, what does it do with the information?


Nothing.

Call CSBackupSetItemExcluded on
the new file?  That may not be appropriate in some cases.

No. It would duplicate all of the metadata attached to the file in a generic fashion. Whether that information includes an "exclude from backup" bit is something CSBackupSetItemExcluded() has to worry about, not the cp program.

Not always ... I've already shown one example (has file been virus-checked)
where the use of an attribute is at least partially the responsibility of some
program outside the traditional definition of "the OS".

Yes, it's the responsibility of the virus checker program, but certainly not of backup or copy applications.

I think you're imagining a world where copy and backup applications need to know about the purpose and meaning of individual pieces of metadata, but that's an untenable solution. It's the kind of design that results in DLL hell. The responsibility for maintaining and interpreting individual metadata records should begin, and end, with the application that uses them, with the understanding that applications like cp will faithfully copy them.

-- James Bucanek

_______________________________________________
Filesystem-dev mailing list      (email@hidden)


