Resolved

encoding on linux platform

description

Saving a DXF file on Linux, with the solution compiled through the xbuild tool, produces a file prefixed with the UTF-8 BOM (3 bytes: EF BB BF). This seems to be due to a Mono issue, and it causes opening the DXF file to fail.

Placing the following code before the return in the DxfDocument.Save(string file, bool isBinary) method works around this:
var preamble = Encoding.Default.GetPreamble();

if (preamble.Length > 0)
{
    var bytes = File.ReadAllBytes(file);

    using (var fs = File.Create(file))
    {
        fs.Write(bytes, preamble.Length, bytes.Length - preamble.Length);
    }                    
}
Essentially, it strips the preamble (if one exists).
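
The snippet above removes preamble.Length bytes whenever the default encoding has a preamble, without checking that the file actually starts with one. A minimal sketch of a more defensive variant follows; the class and method names (BomStripper, HasUtf8Bom, StripUtf8Bom) are hypothetical and not part of netDxf.

```csharp
using System;
using System.IO;

static class BomStripper
{
    // UTF-8 byte order mark: EF BB BF.
    private static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // Returns true if the data starts with the UTF-8 BOM.
    public static bool HasUtf8Bom(byte[] data)
    {
        if (data.Length < Utf8Bom.Length) return false;
        for (int i = 0; i < Utf8Bom.Length; i++)
        {
            if (data[i] != Utf8Bom[i]) return false;
        }
        return true;
    }

    // Rewrites the file without its leading BOM, but only if one is present.
    // Returns true if the file was modified.
    public static bool StripUtf8Bom(string file)
    {
        byte[] bytes = File.ReadAllBytes(file);
        if (!HasUtf8Bom(bytes)) return false;

        using (FileStream fs = File.Create(file))
        {
            fs.Write(bytes, Utf8Bom.Length, bytes.Length - Utf8Bom.Length);
        }
        return true;
    }
}
```

Calling BomStripper.StripUtf8Bom(file) in place of the block above leaves files that have no BOM untouched.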

comments

haplokuon wrote Jan 19 at 6:48 PM

With the .NET Framework, when a StreamWriter is created using the StreamWriter(Stream) constructor without passing an Encoding, it uses UTF-8 without a BOM. Perhaps the Mono Framework does not follow the same rule and defaults to UTF-8 with a BOM, which will generate an error when the file is opened in AutoCAD. From your explanation it looks like this is the case, but the Mono documentation says that "This constructor creates a System.IO.StreamWriter with UTF-8 encoding without a Byte-Order Mark (BOM)", the same as .NET.

Now, if you don't mind, run a couple of tests:
  1. Does it happen only with DXF files saved as version AutoCAD2007 and later? Earlier DXF versions use an ANSI encoding with a specific code page. The newer versions are the ones saved as UTF-8.
  2. Does it happen only with text-based DXF files or also with binary ones? Binary files should not be affected.
There is a way to force the use of UTF-8 without a BOM, without having to strip out the byte order mark manually. Go to the netDxf.IO.DxfWriter class and look for the method "private void Open(Stream stream, Encoding encoding)" at line 402.
private void Open(Stream stream, Encoding encoding)
{
    if (this.isBinary)
        this.chunk = new BinaryCodeValueWriter(encoding == null ? new BinaryWriter(stream) : new BinaryWriter(stream, encoding));
    else
        this.chunk = new TextCodeValueWriter(encoding == null ? new StreamWriter(stream) : new StreamWriter(stream, encoding));
}
and substitute it with:
private void Open(Stream stream, Encoding encoding)
{
    if (this.isBinary)
        this.chunk = new BinaryCodeValueWriter(encoding == null ? new BinaryWriter(stream) : new BinaryWriter(stream, encoding));
    else
        this.chunk = new TextCodeValueWriter(encoding == null ? new StreamWriter(stream, new UTF8Encoding(false)) : new StreamWriter(stream, encoding));
}
If this fix works for you with Mono, I guess you were right, and it should be considered an issue in the Mono Framework: its documentation says one thing but it does another, and since it is an implementation of Microsoft's .NET Framework it should follow the same rules.
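
To check how a given runtime behaves, the bytes produced by each StreamWriter constructor can be compared directly. This is a minimal sketch, independent of netDxf; the BomCheck class and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class BomCheck
{
    // Writes a short string through a StreamWriter and returns the raw bytes.
    // Passing null uses the StreamWriter(Stream) constructor.
    public static byte[] Write(Encoding encoding)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = encoding == null
                ? new StreamWriter(ms)
                : new StreamWriter(ms, encoding))
            {
                writer.Write("0");
            }
            // ToArray is still valid after the stream has been closed.
            return ms.ToArray();
        }
    }

    // Returns true if the data starts with EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    }

    // Prints whether each way of creating the writer emits a BOM.
    public static void Report()
    {
        Console.WriteLine("StreamWriter(stream): " + StartsWithUtf8Bom(Write(null)));
        Console.WriteLine("UTF8Encoding(false):  " + StartsWithUtf8Bom(Write(new UTF8Encoding(false))));
        Console.WriteLine("Encoding.UTF8:        " + StartsWithUtf8Bom(Write(Encoding.UTF8)));
    }
}
```

On the .NET Framework, only the Encoding.UTF8 case should report a BOM. If the first line of BomCheck.Report() prints True under Mono, the default constructor is the source of the extra bytes.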

Daniel

ldelana wrote Jan 19 at 7:45 PM

Hi Daniel,
thanks for your reply, I completely agree with you.
Thanks for the workaround. I will write and report an isolated test for further investigation, to rule out anything specific to my platform. By the way, I used the latest Mono nightly build.

ldelana wrote Feb 17 at 7:38 PM

Hi Daniel,

I ran a couple of tests on Linux and I can confirm that saving the DXF in ASCII does not have any issue with BOM characters.

The problem remains in binary, though, and it is solved by applying your workaround, which I coded this way so that it works on both platforms:
if (this.isBinary)
{
    if (Environment.OSVersion.Platform == PlatformID.Unix)
        this.chunk = new BinaryCodeValueWriter(encoding == null ? new BinaryWriter(stream) : new BinaryWriter(stream, new UTF8Encoding(false)));
    else
        this.chunk = new BinaryCodeValueWriter(encoding == null ? new BinaryWriter(stream) : new BinaryWriter(stream, encoding));
}
else
    this.chunk = new TextCodeValueWriter(encoding == null ? new StreamWriter(stream) : new StreamWriter(stream, encoding));

haplokuon wrote Feb 22 at 7:02 PM

Using a nightly build of a library, as you mentioned, always has its risks, and it most probably contains bugs. Hopefully they will fix it, but there is no problem with adding the fix I proposed, since initializing a stream without an encoding is exactly the same as doing it with UTF8Encoding(false), for both binary and text.

Daniel