Win32text extension

This extension is currently being distributed along with Mercurial.

Author: Bryan O'Sullivan

Deprecation: The win32text extension requires each user to configure it separately for every clone, because the configuration is not copied when cloning.

The EolExtension was therefore created as an alternative. It uses a version-controlled file for its configuration, so each clone uses the right settings from the start.

This extension may be removed in a future release of Mercurial.

To disable deprecation warnings from this extension (until you get around to replacing win32text with eol), add these two lines to your configuration file:

[win32text]
warn = False

Overview

Core Mercurial tracks file content but never modifies it, and is thus binary safe. As a result, the different line ending conventions traditionally used on unix and windows are preserved as they are. This can cause annoyances and problems in mixed environments, where unix2dos and dos2unix scripts have traditionally been used to manage the mess. The win32text Mercurial extension addresses this problem.

Method

The win32text extension assumes, and maintains, that the repository is stored in unix style, so that text files in the repository use \n for newlines. The extension is only used on windows to convert text files, hence the name.

When Mercurial reads text file content from the repository, the extension can decode unix-style \n to windows-style \r\n, and it will warn if the repository file already has windows-style newlines. When Mercurial commits text files to the repository, they are first encoded from windows-style \r\n to unix-style \n.

The remaining problem is to decide which files are text files; Mercurial intentionally doesn't track that kind of meta-data.

The win32text filters are applied to files matching a pattern, and by carefully creating patterns the filters can be applied to exactly the text files in a repository. The filter specification cannot, however, be managed by the repository, so creating complex specifications for distributed projects isn't recommended. Instead, the filters come in two variants: a "dumb" filter which modifies anything looking like a line ending in every file it is applied to, and a "clever" filter which only modifies files that do NOT contain zero (NUL) bytes. The simple heuristic used by the "clever" filter will fail in some cases, but it often works well enough that it can simply be applied to all files.
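The difference between the two variants can be illustrated with a short Python sketch. This is not the extension's actual source: the function names (dumb_encode, dumb_decode, looks_binary, clever_encode, clever_decode) are hypothetical, and the NUL-byte test is simply the heuristic described above written out.

```python
import re

# A bare unix-style newline: LF that is not preceded by CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def dumb_encode(data: bytes) -> bytes:
    """Working copy -> repository: turn every CRLF into LF."""
    return data.replace(b"\r\n", b"\n")


def dumb_decode(data: bytes) -> bytes:
    """Repository -> working copy: turn every bare LF into CRLF."""
    return _BARE_LF.sub(b"\r\n", data)


def looks_binary(data: bytes) -> bool:
    """Assumed heuristic: any NUL byte marks the content as binary."""
    return b"\0" in data


def clever_encode(data: bytes) -> bytes:
    """Like dumb_encode, but leave binary-looking content untouched."""
    return data if looks_binary(data) else dumb_encode(data)


def clever_decode(data: bytes) -> bytes:
    """Like dumb_decode, but leave binary-looking content untouched."""
    return data if looks_binary(data) else dumb_decode(data)
```

In this sketch a binary file that happens to contain the byte 0x0A would be altered by dumb_decode, while clever_decode returns it unchanged as long as it also contains a NUL byte. A binary file without any NUL byte is one case where the heuristic fails.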

Note: do not change the encode or decode filter settings while you have files checked out - use hg update null to remove any working files first.

The extension also provides a hook which can be used to prevent the introduction of text files containing CRLF. This hook is normally used to enforce unix-style line endings in shared repositories, by rejecting commits coming from users who have not enabled the filters.
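The check such a hook performs can be pictured roughly as follows. This is a sketch only, not the code behind hgext.win32text.forbidcrlf: the function name find_crlf_files and its dictionary argument are hypothetical, and it assumes that binary files are skipped using the same NUL-byte heuristic as the "clever" filters.

```python
from typing import Dict, List


def find_crlf_files(files: Dict[str, bytes]) -> List[str]:
    """Return the names of text files whose content contains CRLF.

    `files` maps a file name to the content being committed.
    """
    offenders = []
    for name, data in sorted(files.items()):
        is_binary = b"\0" in data  # assumed heuristic, see lead-in
        if not is_binary and b"\r\n" in data:
            offenders.append(name)
    return offenders
```

A hook built on this idea would reject the commit whenever the returned list is non-empty.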

Patch operations

Operations that apply patches, e.g. hg import and hg qpush, do not honor the win32text filters; a different method is required to make these work. Mercurial can be told to ignore line endings when patching using the patch.eol configuration option; see below for an example.

Without this option you will see a lot of error messages like this:

patching file path/to/file
Hunk #1 FAILED at 0
1 out of 1 hunks FAILED -- saving rejects to file path/to/file.rej
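The likely cause is that the context lines of a patch are compared against the working file, and a line ending in \n does not equal the same line ending in \r\n. The following sketch illustrates the effect of ignoring line endings during that comparison. The helpers strip_eol and context_matches are hypothetical and are not part of Mercurial's patching code.

```python
from typing import Sequence


def strip_eol(line: bytes) -> bytes:
    """Remove one trailing CRLF or LF, if present."""
    if line.endswith(b"\r\n"):
        return line[:-2]
    if line.endswith(b"\n"):
        return line[:-1]
    return line


def context_matches(file_lines: Sequence[bytes],
                    patch_lines: Sequence[bytes],
                    ignore_eol: bool = False) -> bool:
    """Compare a hunk's context lines with the lines found in the file."""
    if ignore_eol:
        file_lines = [strip_eol(line) for line in file_lines]
        patch_lines = [strip_eol(line) for line in patch_lines]
    return list(file_lines) == list(patch_lines)
```

With a working file line b"foo\r\n" and a patch context line b"foo\n", the strict comparison fails, while the comparison with ignore_eol=True succeeds.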

Configuration

Enable the extension in the configuration file (hgrc) and specify which filters should be used where, for example:

[extensions]
hgext.win32text=

[encode]
# Encode files that don't contain NUL characters.

** = cleverencode:

# Alternatively, you can explicitly specify each file extension that
# you want encoded (any you omit will be left untouched), like this:

# **.txt = dumbencode:


[decode]
# Decode files that don't contain NUL characters.

** = cleverdecode:

# Alternatively, you can explicitly specify each file extension that
# you want decoded (any you omit will be left untouched), like this:

# **.txt = dumbdecode:

# The following lines cause patch operations to ignore eol types in
# the input and patch files, and to generate CRLF line endings in the
# output.

[patch]
eol = crlf

If you apply these settings globally but wish to override them for a specific repository, use the "!" syntax in the repository hgrc file to remove the filters:

[extensions]
# Disable the extension
hgext.win32text= !

[encode]
# Disable the encoding filter
** = !

[decode]
# Disable the decoding filter
** = !

To enable the hook, add these lines to the configuration file:

[hooks]
# Reject commits which would introduce windows-style text files

pretxncommit.crlf = python:hgext.win32text.forbidcrlf

Usage

This extension doesn't require user interaction to work.



Win32TextExtension (last edited 2010-10-25 18:49:17 by SteveBorho)