Folders and files with Á (uppercase A with accent) lose the accent when compressed

Aug 12, 2009 at 2:57 PM


I observed that files and folders with Á in their names lose the accent when compressed (I work in Spanish, where Á is common). This did not happen with lowercase á or with the other uppercase vowels I tested (I didn't test all of them).


Aug 12, 2009 at 6:08 PM


Well. I am not a Unicode expert, and I don't know the Unicode requirements for encoding Á. BUT... I know something about Unicode and zip files.

The default encoding for zip files is IBM437. If you write a simple application that uses DotNetZip and you don't specify an encoding, the library will encode file and folder names with the IBM437 code page. Now, IBM437 cannot encode all characters. For the characters it cannot encode, a conversion is done. I don't know if Á can be encoded in IBM437. If it cannot, then you will likely lose information in the conversion, for example Á gets converted to A.
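Whether a given character fits in IBM437 can be checked directly. The following is a minimal sketch in Python (not DotNetZip), using Python's built-in `cp437` codec as a stand-in for the IBM437 code page; the helper name `fits_cp437` is made up for this illustration. It only reports whether a character has a code point in that code page and does not reproduce whatever fallback conversion .NET applies.

```python
# Check which accented characters code page 437 (IBM437) can represent.
# Assumption: Python's "cp437" codec matches the IBM437 code page used for zip names.

def fits_cp437(ch: str) -> bool:
    """Return True if the character has a code point in code page 437."""
    try:
        ch.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False

# Map each test character to whether it survives encoding unchanged.
results = {ch: fits_cp437(ch) for ch in "ÁÉáéñÑ"}

for ch, ok in results.items():
    print(ch, "encodable" if ok else "NOT encodable")
```

With this codec, Á is not encodable while á and É are, which is consistent with the behavior reported in the first post.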

To solve this problem, set ZipFile.UseUnicodeAsNecessary, or set ZipFile.ProvisionalAlternateEncoding, before adding any entries to a zip that you create. Please read the documentation on those properties for a clear understanding of what they do and what the implications are.
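For illustration only, here is what storing a name as Unicode looks like at the zip-format level. This sketch uses Python's standard `zipfile` module, not DotNetZip, so it says nothing definitive about how the two properties above behave; the assumption is that a Unicode-aware writer marks the entry with general-purpose flag bit 11 (0x800) and stores the name as UTF-8. The helper `roundtrip_name` and the file name are hypothetical.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: entry name is stored as UTF-8

def roundtrip_name(name: str):
    """Write one entry to an in-memory zip, then read back its name and UTF-8 flag."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"demo")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_FLAG)

# A name containing Á keeps its accent because it is written as UTF-8.
stored_name, uses_utf8 = roundtrip_name("Ávila.txt")
print(stored_name, uses_utf8)
```

Python's writer switches to UTF-8 only when a name is not plain ASCII, so an ASCII-only name comes back without the flag set.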

If you are already setting one of those properties and you are still losing information in the filenames (for example Á gets converted to A), then I'll have to take a closer look at your code.