UTF-8 Byte order marks inserted by hg merge

Brian Wallis brian.wallis at infomedix.com.au
Tue Jul 1 06:14:52 CDT 2008


G'day Jim,
   As Ian Lewis commented, the BOMs seem to cause more problems than  
they solve.

In this particular case, the Eclipse project classpath is written to
an XML file named .classpath, which carries the XML UTF-8 declaration.
Unfortunately, Eclipse seems unable to read its classpath back in if
the file starts with a BOM. I expect that whatever input stream or XML
parser Eclipse uses cannot handle the BOM. That might be wrong, and a
bug in Eclipse, but I have to live with that.
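
For a file that has already picked up a BOM, one possible workaround is
to strip the three leading bytes (EF BB BF) before Eclipse reads it. The
following is only a minimal sketch, assuming Python is available; the
function names are hypothetical and are not part of Mercurial or Eclipse.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file at path without its BOM. Return True if it was changed."""
    # Work in binary mode so no other bytes (e.g. line endings) are altered.
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False  # no BOM found, leave the file untouched
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Calling strip_bom_in_file(".classpath") in the project directory would
rewrite the file only if it actually starts with a BOM.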

I really don't see how adding the BOM helps in this case: I already
know the file is UTF-8 from the declaration. It seems to be one of
those things that is not backward compatible and that will break many
existing tools. We develop on Windows (XP and Vista), Mac OS X and
Linux (SUSE, Red Hat, Ubuntu, etc.), and I need what is in the
repository to work on all of the above.

brian...

On 01/07/2008, at 5:45 PM, James Talbut wrote:

> Just out of interest why don't you want the BOM there?
> You may want the files to be 7 bit only, but you are telling your XML
> parser that they are UTF-8.
> If a non-ASCII character is put in the file by accident, a text editor
> that is unaware of the XML declaration could save it in an unparsable
> encoding.

Brian Wallis
InfoMedix
p: 3 8615 4553 | f: 3 8615 4501 | e: brian.wallis at infomedix.com.au
Level 5, 451 Little Bourke Street, Melbourne VIC 3000




