Re: How to canonicalize the case of a filename or path?
- Subject: Re: How to canonicalize the case of a filename or path?
- From: Quinn <email@hidden>
- Date: Mon, 14 Apr 2008 10:44:05 +0100
At 22:48 -0700 13/4/08, Jens Alfke wrote:
> Given a path on a case-insensitive filesystem, what's the best way
> to canonicalize the path so that the case of each component - or at
> least the final component - matches what's on disk? For instance,
> such a function would convert the string "/system/LIBrarY" to
> "/System/Library".
>
> The closest thing I could find is realpath(3), but the man page says
> nothing about changing case.
>
> A complication is that the program that needs this feature [the
> Mercurial version-control system] is written in Python and needs to
> avoid platform-specific code. There is a standard Python
> "os.path.realpath" function, but I tested it in Python 2.5 on
> Leopard and it does not alter the case of filenames. So that would
> rule out using realpath(3), assuming that's what its Python namesake
> calls into.
I don't think there's an answer that meets all of your requirements.
I expect that the most efficient way to do it would be to open the
file and then use the F_GETPATH command of fcntl
<x-man-page://2/fcntl>. However, that's not cross-platform. Also, it
won't work if you don't have read access to the file.
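A minimal Python sketch of the F_GETPATH approach, for illustration only. It assumes Python 3.9 or later on Mac OS X, where the fcntl module exposes F_GETPATH; the function name path_from_open_file is hypothetical, and on other platforms the sketch simply returns None.

```python
import fcntl
import os


def path_from_open_file(path):
    """Return the path the kernel reports for an open file, or None if unsupported."""
    # F_GETPATH is only exposed on Mac OS X (Python 3.9+); elsewhere give up.
    getpath = getattr(fcntl, "F_GETPATH", None)
    if getpath is None:
        return None
    # Opening for reading is why this approach needs read access to the file.
    fd = os.open(path, os.O_RDONLY)
    try:
        # The kernel fills the buffer with a NUL-terminated path (1024 = MAXPATHLEN).
        buf = fcntl.fcntl(fd, getpath, b"\0" * 1024)
    finally:
        os.close(fd)
    return os.fsdecode(buf.split(b"\0", 1)[0])
```

Note that the returned path is the kernel's idea of the file's path, so leading components may differ from the input in more than case (for example, symlinks are resolved).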
Another easy, non-cross-platform solution is the
FSPathMakeRef/FSRefMakePath sequence. That will work even if you
don't have read access to the file.
Another option is to use getattrlist to build the path, as I
described in the recent "Can i rely on inode numbers?" thread.
The only cross-platform alternative that I can think of is to iterate
the directory, matching based on inode number. That is:

1. lstat the file system object
2. iterate the parent directory
3. if you find an item with the same inode number, use its name
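The steps above can be sketched in Python roughly as follows. This is an illustration rather than code from the thread: the function name names_with_same_inode is hypothetical, and comparing st_dev alongside st_ino is an added precaution. It returns every matching name instead of only the first, so the caller can see when more than one entry matches.

```python
import os


def names_with_same_inode(path):
    """Return the names in path's parent directory that share path's inode."""
    # Step 1: lstat the file system object (lstat does not follow symlinks).
    target = os.lstat(path)
    parent, _ = os.path.split(os.path.abspath(path))
    matches = []
    # Step 2: iterate the parent directory.
    for name in os.listdir(parent):
        try:
            st = os.lstat(os.path.join(parent, name))
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        # Step 3: an entry with the same device and inode number is a candidate.
        if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
            matches.append(name)
    return matches
```

On a case-insensitive volume, names_with_same_inode("/system/LIBrarY") would be expected to return ['Library'].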
This simplistic algorithm breaks down in the presence of hard links.
You can have two names in the directory both pointing to the same
file system object. Working out which of these names is the correct
one is tricky; any solution would be very dependent on the string
comparison algorithm required by the volume format. For example,
U+0041 U+0301 and U+00E1 are case equivalents on HFS Plus but not on
HFS (original).
$ cd /Volumes/HFS\ Plus/
$ python
[...]
>>> import os
>>> os.close(os.open(u"\u0041\u0301".encode("utf-8"), os.O_CREAT))
>>> os.close(os.open(u"\u00E1".encode("utf-8"), os.O_CREAT))
>>> os.listdir(".")
['.DS_Store', '.fseventsd', '.Trashes', 'A\xcc\x81']
>>> ^D
$ cd /Volumes/HFS
$ python
Python 2.5.1 (r251:54863, Jan 17 2008, 19:35:17)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.close(os.open(u"\u0041\u0301".encode("utf-8"), os.O_CREAT))
>>> os.close(os.open(u"\u00E1".encode("utf-8"), os.O_CREAT))
>>> os.listdir(".")
['.DS_Store', '.fseventsd', '.Trashes', 'a\xcc\x81', 'A\xcc\x81']
>>> ^D
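To see why the two names collide on HFS Plus, it helps to compare them after decomposition and case folding. The sketch below uses Python 3's unicodedata.normalize and str.casefold as a stand-in. HFS Plus uses its own decomposition and case-folding tables, so this only approximates the volume's comparison; the helper name comparison_key is hypothetical.

```python
import unicodedata


def comparison_key(name):
    """Approximate a decomposing, case-insensitive name comparison key."""
    # Decompose first (U+00E1 becomes U+0061 U+0301), then fold case.
    return unicodedata.normalize("NFD", name).casefold()


upper_decomposed = "\u0041\u0301"  # 'A' followed by a combining acute accent
lower_precomposed = "\u00e1"       # precomposed lowercase a with acute

# The raw strings differ, but their comparison keys are equal.
print(upper_decomposed == lower_precomposed)
print(comparison_key(upper_decomposed) == comparison_key(lower_precomposed))
```

A matcher that compares raw bytes would treat these as two distinct names, which is one reason the inode-based approach needs volume-specific comparison rules to break ties.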
S+E
--
Quinn "The Eskimo!" <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
Filesystem-dev mailing list (email@hidden)