Consequences for use of hg for other applications than SCM (was Re: German umlauts in file names)
Adrian Buehlmann
adrian at cadifra.com
Thu Jun 26 13:20:29 CDT 2008
On 26.06.2008 11:08, Adrian Buehlmann wrote:
> On 21.06.2008 02:22, Matt Mackall wrote:
> A somewhat related question:
>
> I have a problem with encoding of filenames in my experimental
> long path patch [1].
>
> What is the correct way to convert from / to unicode filename strings
> on Windows, if I want to mimic the current behavior of Mercurial on
> Windows (which is surely needed for compatibility with current repos)?
OK, I came up with the modification pasted at the end [3]. It is most likely
very inefficient, but so far it seems to do the same encoding/decoding as
current Mercurial:
> dir
Volume in drive W is Sys
Volume Serial Number is 8017-C29E
Directory of W:\tmp\aa
26.06.2008 20:01 <DIR> .
26.06.2008 20:01 <DIR> ..
26.06.2008 19:44 <DIR> .hg
26.06.2008 20:01 12 äöü.txt
1 File(s) 12 bytes
3 Dir(s) 30'946'508'800 bytes free
> hg sta
? Σ÷ⁿ.txt
> hgt sta
--- running hg from W:\hg-longpath
? Σ÷ⁿ.txt
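For illustration, here is a minimal Python 3 sketch of what the two helpers in [3] amount to. The patch itself targets Python 2, so the bodies below are my own re-statement under that assumption, not the patch code. Mapping every character to the byte with the same numeric value is the same as a latin-1 round trip. The file name shown by "hg sta" is also consistent with those bytes being displayed by a console that uses code page 437, which is an assumption about the console above and not something the output proves.

```python
def shrink(text):
    """Python 3 analogue of shrink() in [3]: one byte per code point.

    bytes() raises ValueError for any code point above 255, which
    mirrors chr() failing on such values in Python 2.
    """
    return bytes(ord(c) for c in text)


def expand(raw):
    """Python 3 analogue of expand() in [3]: one code point per byte."""
    return ''.join(chr(b) for b in raw)


# a-umlaut, o-umlaut, u-umlaut, as in the directory listing above
name = '\u00e4\u00f6\u00fc.txt'
raw = shrink(name)

# The per-character mapping is exactly a latin-1 round trip.
assert raw == name.encode('latin-1') == b'\xe4\xf6\xfc.txt'
assert expand(raw) == raw.decode('latin-1') == name

# Interpreting the same bytes as code page 437 yields
# GREEK CAPITAL SIGMA, DIVISION SIGN, SUPERSCRIPT N.
assert raw.decode('cp437') == '\u03a3\u00f7\u207f.txt'
```

The sketch also shows the limit of this approach: any file name containing a character above U+00FF makes shrink() raise, so it can only mimic the current behavior for names that fit into a single-byte range.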
[3]:
diff --git a/mercurial/osutil.py b/mercurial/osutil.py
--- a/mercurial/osutil.py
+++ b/mercurial/osutil.py
@@ -12,27 +12,29 @@
 def listdir(path, stat=False):
     '''listdir(path, stat=False) -> list_of_tuples

     Return a sorted list containing information about the entries
     in the directory.

     If stat is True, each element is a 3-tuple:

       (name, type, stat object)

     Otherwise, each element is a 2-tuple:

       (name, type)
     '''
     result = []
     prefix = path + os.sep
     names = os.listdir(util.longpath(path)) # returns unicode strings on Windows
     names.sort()
     for fn in names:
-        fn = fn.encode()
+        def shrink(unicodestring):
+            return ''.join([chr(ord(c)) for c in unicodestring])
+        fn = shrink(fn)
         st = os.lstat(util.longpath(prefix + fn))
         if stat:
             result.append((fn, _mode_to_kind(st.st_mode), st))
         else:
             result.append((fn, _mode_to_kind(st.st_mode)))
     return result
diff --git a/mercurial/util.py b/mercurial/util.py
--- a/mercurial/util.py
+++ b/mercurial/util.py
@@ -1109,41 +1109,43 @@
         msvcrt.setmode(fd.fileno(), os.O_BINARY)

     def pconvert(path):
         return '/'.join(splitpath(path))

     def localpath(path):
         return path.replace('/', '\\')

     _longpathprefix = "\\\\?\\"
     def longpath(path):
         '''convert path to a Windows long path
         needed to call Windows api with paths longer than 260'''
         if path.startswith(_longpathprefix):
             res = path
         else:
             path = path.replace('/', '\\').replace('\\.\\', '\\')
             if path[-1] == '.':
                 path = path[:-1]
             if not os.path.isabs(path):
                 path = os.path.abspath(path)
-            res = unicode(_longpathprefix + path)
+            def expand(s):
+                return u''.join([unichr(ord(c)) for c in s])
+            res = expand(_longpathprefix + path)
         return res

     def normpath(path):
         return pconvert(os.path.normpath(path))

     makelock = _makelock_file
     readlock = _readlock_file

     def samestat(s1, s2):
         return False

     # A sequence of backslashes is special iff it precedes a double quote:
     # - if there's an even number of backslashes, the double quote is not
     #   quoted (i.e. it ends the quoted region)
     # - if there's an odd number of backslashes, the double quote is quoted
     # - in both cases, every pair of backslashes is unquoted into a single
     #   backslash
     # (See http://msdn2.microsoft.com/en-us/library/a1y7w461.aspx )
     # So, to quote a string, we must surround it in double quotes, double
     # the number of backslashes that preceed double quotes and add another