Zip.Comment not accepting some characters

Jan 8, 2012 at 4:08 PM

Problem: My program collects data to store in the zip comments of the files it creates. A user pointed out that if Zip.Comment contains the ™ symbol, the comment is not saved properly (the zip file itself remains fully intact). The entire multi-line comment sometimes turns into "*"; I have also seen it become "T" or "!T". I'm using v1.9.1.8, with 2010.


Reproduction: Create the zip file, set Zip.Comment = "™", then call Zip.Save. The comment appears as "*" in the resulting zip file.


This is a very easy issue for me to work around, but I thought I would make you aware of it. Can you tell me which other characters might cause problems in the comment, so I can apply the workaround to them in my program?
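
One rough way to list the characters at risk is to check which ones cannot be encoded in the code page used for the comment. The sketch below assumes that code page is IBM437, which the DotNetZip documentation describes as the library's default; that is an assumption here, not something confirmed in this thread. Python is used only because its codecs are easy to inspect, and `fits_ibm437` and `risky_chars` are hypothetical helper names.

```python
def fits_ibm437(ch: str) -> bool:
    """Return True if ch can be encoded in the IBM437 (cp437) code page."""
    try:
        ch.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False


def risky_chars(comment: str) -> list:
    """Distinct characters in comment that cp437 cannot encode,
    in order of first appearance."""
    seen = []
    for ch in comment:
        if not fits_ibm437(ch) and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    # Assumption: IBM437 is the encoding applied to the comment.
    print(risky_chars("Widget™ v2 © 2012, café"))
```

Under that assumption, accented letters such as "é" are safe, while symbols such as "™" and "©" are not, so a workaround would need to cover every character this check reports.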

Jan 10, 2012 at 11:03 PM

Check the documentation for ZipFile.Comment for information on this.

Jan 10, 2012 at 11:44 PM
Edited Jan 10, 2012 at 11:45 PM

Thanks for the response. I neglected to specify that I was using the following code:

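' Requires: Imports Ionic.Zip and Imports System.Text
' commentText is a String holding the multi-line comment
Using Zip As ZipFile = New ZipFile(path)
    Zip.UseZip64WhenSaving = Zip64Option.AsNecessary
    Zip.AlternateEncodingUsage = ZipOption.AsNecessary
    Zip.AlternateEncoding = Encoding.Unicode

    Zip.Comment = commentText
    Zip.Save() ' write the archive, including the comment, to path
End Using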

I changed Zip.AlternateEncoding = Encoding.Unicode to Zip.AlternateEncoding = Encoding.Default, and the comment appeared to be saved properly. But when I then tried to read it back, using the following code:

Using Zip As ZipFile = ZipFile.Read(FileName)
    Zip.AlternateEncodingUsage = ZipOption.Always
    Zip.AlternateEncoding = Encoding.Default

    ' commentText receives the archive-level comment
    Dim commentText As String = Zip.Comment
End Using


I ended up with a strange ASCII character (instead of * or !T). Is there something else I am missing?
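
One possible explanation for the stray character, offered as a guess and not verified against the DotNetZip source, is that the comment bytes were written with one code page and decoded with another. The Python sketch below assumes Encoding.Default was Windows-1252 on the writing machine and that the reading step fell back to IBM437; both are assumptions, and `reinterpret` is a hypothetical helper.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


if __name__ == "__main__":
    # Same code page on both sides: the trademark sign survives.
    print(reinterpret("™", "cp1252", "cp1252"))
    # Written as Windows-1252 (assumed), read as IBM437 (assumed):
    # the single byte 0x99 decodes to a different character.
    print(reinterpret("™", "cp1252", "cp437"))
```

Note that the reading code above sets the encoding properties only after ZipFile.Read has returned, so the comment may already have been decoded by then; this too is an assumption.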

Jan 10, 2012 at 11:55 PM

Hmm, I don't know. I'm trying to think of the logic path in the code.

I think that the alternate encoding (Unicode, for example) is used only if the filename or comment on any of the entries necessitates it. I don't believe that using regular ASCII filenames and comments on all the entries, with a Unicode comment on the zipfile as a whole, will result in a Unicode-encoded zipfile. I think.

I don't have the code in front of me so cannot review it just now to verify this theory.

If I am right, then the way to enforce Unicode is to use a Unicode character in the comment on a zipfile entry (ZipEntry.Comment), as well as on the zipfile itself (ZipFile.Comment).

I'm not saying this is desirable behavior. I'm just guessing that this edge case isn't well handled, and I'm suggesting a possible workaround.
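
The guess above can be modeled in a few lines. This is a sketch of the theory only, not of DotNetZip's actual code: it assumes IBM437 as the default encoding, and `encodable` and `uses_alternate_encoding` are hypothetical names, written in Python for easy inspection.

```python
def encodable(text: str, codec: str = "cp437") -> bool:
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def uses_alternate_encoding(entry_names, entry_comments, archive_comment):
    """Model of the guessed rule: only per-entry names and comments are
    consulted; archive_comment is deliberately ignored."""
    per_entry_text = list(entry_names) + list(entry_comments)
    return any(not encodable(t) for t in per_entry_text)


if __name__ == "__main__":
    # Only the archive comment needs Unicode: the guessed rule says no switch.
    print(uses_alternate_encoding(["a.txt"], [""], "™"))
    # An entry comment also needs Unicode: the guessed rule says switch.
    print(uses_alternate_encoding(["a.txt"], ["™"], "™"))
```

If the theory holds, the first case is the one reported in this thread, and the second is the suggested workaround.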


Jan 10, 2012 at 11:56 PM
This discussion has been copied to a work item.