Error extracting >4GB file with WinZip

Jun 20, 2009 at 9:39 AM

Hi,

I have created a Zip file bigger than 4GB with DotNetZip 1.8.3.31.  If I try extracting it with WinZip, the first 4GB extracts OK but after that I get the following error:

Warning: skipping "<filename>". The compressed size stored in the local header for this file is not the same as the compressed size stored in the central header.

Is this something that can be fixed?  Thanks very much.

Coordinator
Jun 20, 2009 at 6:41 PM

Maybe.

Does it extract correctly with DotNetZip?   

What version of WinZip?  I am not sure when WinZip started supporting ZIP64.  You may want to check that.

 

Coordinator
Jun 20, 2009 at 9:37 PM

Any chance you can get me the file, so I can have a look and see what went sideways?

Jun 21, 2009 at 12:05 AM

Yes, it extracts OK with DotNetZip but I'm a bit concerned that there is something wrong with the file if WinZip can't extract it.  I'm using WinZip 12.0 which definitely supports Zip64.

It's going to be difficult for me to get the file to you due to the size but it seems to happen with any file I create over 4GB.

Thanks.

Coordinator
Jun 21, 2009 at 3:25 AM

I understand.  It *is* concerning.

I don't have WinZip 12.  If you can't get me the file, maybe I can get the trial version and test a zip file I produce here. 

 

Jun 22, 2009 at 9:38 AM

OK, here's a thought.  When I create the Zip file I set the UseZip64WhenSaving option to AsNecessary.  If I have a look at the entries in the Zip file they don't have bit 3 of the general purpose BitField set.  However, if I set UseZip64WhenSaving to Always instead, bit 3 does get set.

I noticed in the code a comment about WinZip 12 having trouble reading Zip64 files that don't have bit 3 set so I am wondering if this is the problem.  It appears that bit 3 only gets set if UseZip64WhenSaving is set to Always or if it's set to AsNecessary and the stream is not seekable.
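For anyone who wants to check bit 3 on their own archives, here is a minimal sketch in Python (not DotNetZip code). It reads the general purpose bit flag straight out of a local file header. The function names are made up for illustration, and it assumes the caller already knows the offset of the local header.

```python
import struct

LOCAL_HEADER_SIG = b"PK\x03\x04"
BIT3_DATA_DESCRIPTOR = 0x0008  # bit 3 of the general purpose bit flag


def local_header_flags(archive: bytes, offset: int = 0) -> int:
    """Return the general purpose bit flag of the local header at `offset`."""
    if archive[offset:offset + 4] != LOCAL_HEADER_SIG:
        raise ValueError("no local file header at this offset")
    # Layout: signature (4 bytes), version needed (2), general purpose flag (2).
    (flags,) = struct.unpack_from("<H", archive, offset + 6)
    return flags


def bit3_is_set(flags: int) -> bool:
    """True when the entry says its sizes and CRC follow in a data descriptor."""
    return bool(flags & BIT3_DATA_DESCRIPTOR)
```

Running `bit3_is_set(local_header_flags(data, offset))` over each entry of an archive saved with AsNecessary and one saved with Always should show the difference described above.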

Unfortunately I can't test it right now because I don't have access to WinZip 12 but I'll test it tonight and let you know.  If using Always does fix it then that is a good enough workaround for me.

Coordinator
Jun 22, 2009 at 4:48 PM

That's a good thought and worth testing.  I remember the situation with WinZip 12 and bit 3.  The ZIP64 spec seems pretty clear that ZIP64 and bit3 are independent, but WinZip has them linked. Not sure why. 

I don't have WinZip12; I never bought it.  So for the moment I don't have Unit Tests that verify that WinZip can read the zipfiles produced by DotNetZip.  This only matters in the edge cases, and Zip64 is still included in that. Maybe I should buy the program.

Getting back to the point:  After inspecting the code, I think you are correct: bit 3 does not get set as it should when AsNecessary is used and ZIP64 turns out to be needed.

 

Coordinator
Jun 22, 2009 at 4:49 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Jun 22, 2009 at 6:12 PM

So, setting UseZip64WhenSaving to Always does work - I can extract the Zip file with WinZip.  However, I can't help feeling that there is still something going on because WinZip itself does not set bit 3 on ZIP64 entries.  So it seems a bit odd that to create a file fully compatible with WinZip, DotNetZip has to set bit 3.

Coordinator
Jun 22, 2009 at 6:30 PM

I agree that there is still a mystery there. I don't understand fully what is going on.  I haven't looked at the WinZip-generated ZIP64 archives in a while, but when I did I could not figure out why DotNetZip's archives could not be read.  I remember being puzzled by the whole thing.  Then I found the solution that Perl's IO::Compress module took (set bit 3) and just did it that way without further investigation.  I'll have to take a closer look. 

Regarding the work item:  I just checked in a fix that sets bit 3 correctly when AsNecessary is used.  I am building the binary release now.  v1.8.3.33 is the first binary release that will have this fix. 

 

 

Coordinator
Jun 22, 2009 at 8:18 PM

FYI: There are two workitems that came out of this:

  • 7920 - Create WinZip tests in the test suite
  • 7917 - bit 3 is not always set when it should be.

7917 is fixed in v1.8.3.33.

Jun 22, 2009 at 8:21 PM

I've been looking at the code and running some tests and I think I have got to the bottom of this.  It appears that WinZip does not like that the RelativeOffsetOfLocalHeader and starting disk number are being put in the ZIP64 Extra Field in the local header of each entry.  These should only be in the central directory as they are only relevant there.  I changed the code so that the ZIP64 Extra Field in the local header carries 16 bytes of data instead of 28, containing just the uncompressed size and compressed size, and it works perfectly. You do not need to set bit 3.
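To make the 16-versus-28-byte difference concrete, here is a rough Python sketch of the two ZIP64 extra field shapes discussed above. It is an illustration, not DotNetZip's code, and the function names are hypothetical. It also assumes the central directory form carries all four values. As far as I understand the ZIP spec, the central directory version only needs the fields whose 32-bit counterparts are saturated.

```python
import struct

ZIP64_HEADER_ID = 0x0001  # header ID of the ZIP64 extended information field


def zip64_extra_local(uncompressed_size: int, compressed_size: int) -> bytes:
    """Local-header form: just the two 8-byte sizes (16 bytes of data)."""
    data = struct.pack("<QQ", uncompressed_size, compressed_size)
    return struct.pack("<HH", ZIP64_HEADER_ID, len(data)) + data


def zip64_extra_central(uncompressed_size: int, compressed_size: int,
                        local_header_offset: int, disk_start: int = 0) -> bytes:
    """Central-directory form with all four values (28 bytes of data)."""
    data = struct.pack("<QQQI", uncompressed_size, compressed_size,
                       local_header_offset, disk_start)
    return struct.pack("<HH", ZIP64_HEADER_ID, len(data)) + data
```

Each result includes the 4-byte header (ID plus data size), so the total lengths are 20 and 32 bytes respectively.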

Let me know if you agree with my findings.

Coordinator
Jun 22, 2009 at 8:46 PM

Ooh, very good.  That is good to know. 

I don't recall why I put that information - the relative offset and the disk number - in the extra field. 

Thanks for the analysis.  I will modify the code and set up some new tests for this.

Coordinator
Jun 24, 2009 at 2:44 AM

Status update.  In my tests, when I try to use WinZip to unzip a ZIP64 archive created by DotNetZip with the new changes that do not set bit 3, I am getting the original error you reported ("The compressed size stored in the local header for this file is not the same as the compressed size stored in the central header.").  This is happening only if I don't set bit 3. 

I am also sometimes getting CRC mismatch errors. ("The 32-bit CRC stored in the local header for this file is not the same as the 32-bit CRC stored in the central header"). 
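Both WinZip messages come down to the local header disagreeing with the central directory. As a rough way to reproduce that check without WinZip, here is a Python sketch that compares the two for every entry. It relies on the standard `zipfile` module to read the central directory; `header_mismatches` and `_zip64_sizes` are hypothetical helper names. It skips entries with bit 3 set, on the assumption that their local fields may legitimately be placeholders.

```python
import struct
import zipfile

LOCAL_FMT = "<4sHHHHHIIIHH"  # 30-byte fixed part of a local file header


def _zip64_sizes(extra: bytes):
    """Return (uncompressed, compressed) from a local ZIP64 field, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        if header_id == 0x0001 and size >= 16 and pos + 20 <= len(extra):
            return struct.unpack_from("<QQ", extra, pos + 4)
        pos += 4 + size
    return None


def header_mismatches(fp):
    """List (filename, field) pairs where local and central headers differ.

    `fp` is a seekable binary file object positioned anywhere.
    """
    with zipfile.ZipFile(fp) as zf:
        infos = zf.infolist()  # values taken from the central directory
    problems = []
    for info in infos:
        fp.seek(info.header_offset)
        (sig, _, flags, _, _, _, crc, csize, usize,
         name_len, extra_len) = struct.unpack(LOCAL_FMT, fp.read(30))
        if sig != b"PK\x03\x04":
            problems.append((info.filename, "local header signature"))
            continue
        if flags & 0x0008:
            continue  # bit 3: real values live in the trailing data descriptor
        if 0xFFFFFFFF in (csize, usize):
            fp.seek(name_len, 1)  # skip the file name to reach the extra field
            sizes = _zip64_sizes(fp.read(extra_len))
            if sizes is None:
                problems.append((info.filename, "missing ZIP64 extra field"))
                continue
            usize, csize = sizes
        if csize != info.compress_size:
            problems.append((info.filename, "compressed size"))
        if usize != info.file_size:
            problems.append((info.filename, "uncompressed size"))
        if crc != info.CRC:
            problems.append((info.filename, "crc"))
    return problems
```

Usage would be along the lines of `with open("big.zip", "rb") as f: print(header_mismatches(f))`, where `big.zip` is a placeholder name. An empty list means the headers agree for every entry that was checked.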

More testing and investigation needed.

 

Coordinator
Jun 24, 2009 at 3:46 AM

There were a few bits out of place. Everything is working now. I will upload a new version when the tests complete. 

Jun 25, 2009 at 8:15 PM

I've just been trying out the new 1.8.3.35 version with Zip64 files and I've found a problem.  I am also using AES encryption and I can't extract any of the Zip64 entries in my Zip file (with WinZip - it seems to work OK with DotNetZip).

Looking at the code I think I can see what is happening.  For the local header you are now creating a Zip64 field of just 16 bytes (for the uncompressed size and compressed size).  However, in WriteFileData (around line 4201) where you have detected that Zip64 is needed you are attempting to write the uncompressed size, compressed size and relative offset to the Zip64 field.  But as you have only allocated 16 bytes in this field you are overwriting whatever comes next in the header - either the AES field or the NTFS times.
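The overwrite described above can be shown with a small Python sketch (an illustration only; it is not the DotNetZip code, and the function names are invented). It lays out a 16-byte ZIP64 field followed by an AES extra field, then patches in three 8-byte values instead of two. The AES field values follow the WinZip AES extra field layout as I understand it (header ID 0x9901, 7 bytes of data).

```python
import struct


def build_local_extra():
    """Return (extra_bytes, zip64_length): a ZIP64 field followed by an AES field."""
    # ZIP64 field: 4-byte header plus 16 bytes reserved for the two sizes.
    zip64 = struct.pack("<HHQQ", 0x0001, 16, 0, 0)
    # AES field: version 2, vendor "AE", strength 3 (256-bit), real method 8.
    aes = struct.pack("<HHH2sBH", 0x9901, 7, 2, b"AE", 3, 8)
    return bytearray(zip64 + aes), len(zip64)


def patch_sizes_buggy(extra, usize, csize, offset):
    """Writes 24 bytes into a 16-byte slot, spilling into the next field."""
    struct.pack_into("<QQQ", extra, 4, usize, csize, offset)


def patch_sizes_fixed(extra, usize, csize):
    """Writes only the two sizes the local ZIP64 field has room for."""
    struct.pack_into("<QQ", extra, 4, usize, csize)
```

After `patch_sizes_buggy`, the first 8 bytes of the AES field hold the relative offset instead of the AES header, which would explain why WinZip cannot extract the encrypted entries.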

On a related note, in ProcessExtraField you always attempt to read the local offset from the Zip64 field.  I'm not sure what would happen if you are reading a local header, where this value is not present, rather than the central directory.
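One way to guard against that is to read only as many values as the field's data size allows. Here is a hedged Python sketch of the idea. `Zip64Info` and `parse_zip64_data` are hypothetical names, and it assumes the values appear in the fixed order sizes, offset, disk number with none of the earlier ones omitted, which holds for the local header but is a simplification for the central directory.

```python
import struct
from typing import NamedTuple, Optional


class Zip64Info(NamedTuple):
    uncompressed_size: Optional[int]
    compressed_size: Optional[int]
    local_header_offset: Optional[int]
    disk_start: Optional[int]


def parse_zip64_data(data: bytes) -> Zip64Info:
    """Parse the data part of a ZIP64 extra field (after the 4-byte header).

    Values that do not fit in `data` come back as None instead of being
    read from whatever bytes happen to follow.
    """
    values = []
    pos = 0
    for fmt in ("<Q", "<Q", "<Q", "<I"):
        size = struct.calcsize(fmt)
        if pos + size <= len(data):
            values.append(struct.unpack_from(fmt, data, pos)[0])
            pos += size
        else:
            values.append(None)
            pos = len(data)  # nothing further can be read reliably
    return Zip64Info(*values)
```

With a 16-byte local field the offset and disk number come back as None. With the 28-byte form all four values are populated.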

Coordinator
Jun 26, 2009 at 12:46 AM

Hmm, yes, there were a few problems with v1.8.3.35.  The tests did not all pass. I put that up too fast.

Coordinator
Jun 26, 2009 at 1:16 AM

So the problem was that, using DotNetZip,  if you create a zip file with ZIP64 and AES, then you cannot extract entries with WinZip, is that right?

I want to test this to make sure I get it right.

Coordinator
Jun 26, 2009 at 5:59 AM

OK, v1.8.3.37 is up.  There are new tests for WinZip extraction of zip files, in numerous combinations (encrypted and not, zip64 and not, and others).  Everything seems to work now.  Let me know how it goes!

 

Jun 26, 2009 at 8:46 PM

Hi.  That works now.  Thanks very much.

Coordinator
Jun 26, 2009 at 9:08 PM

Thanks for all the analysis and thanks for working through it with me.