Re: Regular Expression problems with long-filenames



On 03.03.2006 at 23:22, Bill Peters wrote:

On Fri, 3 Mar 2006 21:58:26 +0100, Mike Fischer wrote:
Bill,

I think you are assuming that two files with the same long name
would produce the same mangled name. This is wrong AFAIK.

The mangled part is a hex representation of the HFS file id. So for
two files in two directories which have the same long name the
mangled part will be different because each file has its own
unique-per-volume file id.


Thanks, but this is a limitation that I am VERY well aware of. All anyone has to do is run
the Backup or Equal tool in the Shell to see that the mangled names do not match in supposedly
"identical" backup directories.

I see. Sorry for the noise then.


That's why I'm attempting to do a regular-expression match on the first 20 or so characters of the
filename and loop to find matches in the destination directory -- once I have the non-mangled part,
I do a


     for g in `files "{dstdir}{nonmangledbase}"<option-x>`

to match all of the potential duplicate candidates in the destination directory. (I have also found
cases where the hex portion of the mangled name differs in length between one drive and another.
Probably because of the inode range being different.) This part of the script appears to be working.

File ids are 32-bit integers. In theory they can take any hex value from 0 to FFFFFFFF (probably minus some reserved values). I'd guess that the mangling algorithm drops leading zeros, so a volume that has seen a lot of files will likely produce longer numbers (assuming the file in question was created recently).
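
To illustrate that guess, here is a minimal Python sketch (not MPW, and not the actual mangling code, which I have not seen). It assumes the id is written as uppercase hex with leading zeros dropped; the function name hex_suffix and the sample ids are made up for the example.

```python
def hex_suffix(file_id: int) -> str:
    """Format a 32-bit file id as uppercase hex without leading zeros.

    Assumption: this mirrors what the mangling does; it is only a guess.
    """
    if not 0 <= file_id <= 0xFFFFFFFF:
        raise ValueError("file id must fit in 32 bits")
    return format(file_id, "X")


# Hypothetical ids: a small one from a fresh volume, larger ones from a
# volume that has already handed out many ids.
for file_id in (0x2F, 0x1A2B3, 0x00F4C210):
    print("#" + hex_suffix(file_id))
```

Under that assumption the suffix can be anywhere from one to eight hex digits long, which would explain the same file getting suffixes of different lengths on two drives.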



My problem is trying to find the right quoting and regular expression syntax to get the non-mangled
portion of the filename for use in the loop, above. And what I have so far,


      if "{f}" =~ /(?«18,26»[¬#])®2?#[0-9A-F]«2,10»(.?«1,6»)*®3/

just isn't working for filenames with quotes or extensions in the non-mangled part.

Sorry, I don't think I can help there. My MPW days are long gone. I've moved on. But the MPW Editor and IDE are still interesting enough for me to subscribe to this mailing list ;-)
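
For what it's worth, outside of MPW the shape of the match could be sketched with a conventional regex. The following is Python, not MPW Shell syntax, and it rests on assumptions: that a mangled name looks like base, then "#", then 1 to 8 uppercase hex digits, then an optional short extension. The names MANGLED and split_mangled are made up for the example.

```python
import re

# Assumed layout of a mangled name: <base>#<hex id>[.<ext>]
# The greedy base means the LAST '#' that is followed by hex digits wins,
# so quotes and extra '#' characters in the base are harmless.
MANGLED = re.compile(
    r"(?P<base>.*)"
    r"#(?P<file_id>[0-9A-F]{1,8})"
    r"(?P<ext>\.[^.#]{1,5})?"
)


def split_mangled(name):
    """Return (base, hex_id, ext) for a mangled-looking name, else None."""
    m = MANGLED.fullmatch(name)
    if m is None:
        return None
    return m.group("base"), m.group("file_id"), m.group("ext") or ""


print(split_mangled('Bill\'s "final" report dr#1A2B3.txt'))
print(split_mangled("plain name.txt"))
```

Because the whole name is matched in one piece and never passed through a shell, quotes in the base need no special treatment here. The base and extension could then be used to look for candidates in the destination directory.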


One possible problem though: How do you tell a mangled long name from a name that merely looks like a mangled long name but actually *is* the (long) filename? This can happen, for example, when tools that are not HFS+ aware copy a file.

How about writing an MPW tool to extract the long filename? You would probably need to do some juggling to get the right headers and libraries for accessing the HFS+ APIs, but that should be doable using a separate piece of code loaded into the tool at runtime. You'd also need to invent a method for representing arbitrary Unicode characters, as MPW is not Unicode aware to my knowledge.

Once you have the long filename (even if it uses some escaped characters) you can compare it to another one obtained by the same method.
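
As an illustration of the escaping idea only (in Python rather than as an MPW tool, and with a notation I made up), printable ASCII could be kept as-is and everything else written as \u{HEX}, with the backslash itself doubled so that two different names never produce the same escaped string. The name escape_name is hypothetical.

```python
def escape_name(name: str) -> str:
    """Represent a Unicode filename using printable ASCII only.

    The \\u{HEX} notation is an invented convention for this sketch.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # double the escape character itself
        elif 0x20 <= cp < 0x7F:
            out.append(ch)              # printable ASCII passes through
        else:
            out.append("\\u{" + format(cp, "X") + "}")
    return "".join(out)


# Two names obtained by the same method can be compared as plain strings.
source_name = escape_name("R\u00e9sum\u00e9 2006.txt")
backup_name = escape_name("R\u00e9sum\u00e9 2006.txt")
print(source_name, source_name == backup_name)
```

The comparison only works if both names went through the same escaping and hold the same sequence of code points in the first place.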


Mike
--
Mike Fischer    Software development, IT consulting, training, sales
Web: <http://homepage.mac.com/mike_fischer/index.html>
Note: I read this list in digest mode!
      Send me a private copy for faster responses.

_______________________________________________
MPW-Dev mailing list      (email@hidden)