Zlib: Oversubscribed dynamic bit lengths tree

Jul 24, 2009 at 2:33 AM

Hey,

I'm a little new to the zip process, so bear with me.  I had some code that was not compatible with large .zip files, so I was looking around for something that could handle them.  I came across this library and decided to try it.  Unlike other packages, this one would open the archive and start extracting the files, so I thought the problem was solved.  However, just before it completed, I got an "Oversubscribed dynamic bit lengths tree" error.

The .zip file was created by WinZip 12 and is slightly over 4.3 GB.  It extracts fine with WinZip, but DotNetZip throws the aforementioned error, which is raised from the zlib layer.  It always throws the error on the same entry.  I have looked at the file details in WinZip and the archive has the Zip64 markings, so it should be valid at this size.  However, the entry it has issues with is the first one where the "relative offset of local header" exceeds 0xFFFFFFFF (I got this from the WinZip details for the file).  I tried skipping past this entry, but every entry after it throws the error as well.

Is there some setting I'm missing in DotNetZip for Zip64 (I already tried setting Zip64Option.Always)?  Could it be the .zip file?  Does anyone know the cause of, or a solution to, this error?  Any help would be greatly appreciated.  Thanks.

-Wayne

 

 

Coordinator
Jul 24, 2009 at 5:02 AM

Ah, hm, I gotta say, I think you're in trouble.  "Oversubscribed dynamic bit lengths tree" is an error I cannot explain.  It happens in the ZLIB layer, which I did not write, and I don't understand the error well enough to explain it to you.

If it happens with every entry in the ZIP file, then maybe it is

  • a corrupted zip file. Can you open it in WinZip12?  
  • an unsupported compression method.  Is it using DEFLATE64 ?   DEFLATE64 is not the same as ZIP64. DotNetZip does ZIP64, not DEFLATE64.

Is it just one file you are trying to decompress?  Then why not use the tool you used to compress it?  If it is more than one file, then why not use the same thing on both ends - either WinZip or DotNetZip.

ps: The Zip64Option applies only when saving a file.  You don't need to set it when reading a file.

Jul 24, 2009 at 6:37 AM

Thanks for the quick response.  How can you tell if it is using Deflate64?  There are approximately 31,500 files in the .zip.  DotNetZip works for the first 31,200 files and then gets the error.  When looking at the details in WinZip, the exact file where it first has problems is the first file where the "relative offset of local header" field that WinZip outputs reaches 0xFFFFFFFF.  All the rest of the files from that point on list the "relative offset of local header" as 0xFFFFFFFF as well.  I have no idea what the field means, but I'm guessing that it is related to the zlib error.

Unfortunately, we are not creating the .zip files, just receiving them.  In this case they gave us a .zip with a password so we had to extract it first with the password and then compress it back without a password so it would go through our system. 

Coordinator
Jul 24, 2009 at 11:03 AM

Ok, well then if you are just receiving them, why not use WinZip to read the zip file?  All the other tools seem to fail opening it.  So use the thing that created it, to open it.  You apparently have WinZip, because you are using it to examine the file.  I don't understand why you don't just extract it with WinZip.  Or ask your sender to re-pack that file differently.

Deflate64 is defined in the appnote from PKWare http://www.pkware.com/documents/casestudies/APPNOTE.TXT as compression method 9.  Looking at the code, it's pretty clear that DotNetZip would throw a specific exception if there was an unsupported compression method.  You're not getting that exception, so I don't think it's using Deflate64.
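For reference, the compression method is a 2-byte field in each entry's header, and the values that matter here are fixed by the APPNOTE: 0 is stored, 8 is Deflate, and 9 is Deflate64. A minimal sketch of naming that field follows; the class and method names are hypothetical and are not part of DotNetZip.

```csharp
using System;

// Hypothetical helper, not DotNetZip code: names the ZIP compression
// method IDs that come up in this thread (values per the PKWARE APPNOTE).
public static class CompressionMethodSketch
{
    public static string Describe(ushort method)
    {
        switch (method)
        {
            case 0: return "stored";
            case 8: return "deflate";
            case 9: return "deflate64";
            default: return "other (" + method + ")";
        }
    }
}
```

The WinZip detail listings posted in this thread report "compression method (08)" or "(00)" for the entries shown, which this helper would describe as "deflate" and "stored".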

The "relative offset of local header" (ROLH) is the position within the zip file where the entry's local header, and after it the entry data, begins.  For entries that start before the 4 GiB mark (0xFFFFFFFF bytes, about 4.29 GB), the 32-bit quantity suffices.  For entries beyond that point, the ROLH would be higher than 0xFFFFFFFF.  According to the ZIP64 extension, in that case the archiving tool should put 0xFFFFFFFF in the place where the ROLH is normally stored, and then store the actual offset elsewhere, as a 64-bit quantity.  So what you are seeing is that, for the last few files in that archive, the offset within the zip archive itself is beyond 0xFFFFFFFF.
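The sentinel rule can be sketched in a few lines. This illustrates the ZIP64 convention only; the names are hypothetical and it is not DotNetZip's code.

```csharp
using System;

// Hypothetical sketch of the ZIP64 convention for the 32-bit
// "relative offset of local header" field; not DotNetZip code.
public static class RolhSketch
{
    public const uint Sentinel = 0xFFFFFFFF;

    // What a writer stores in the 32-bit field: the offset itself if it
    // is below the sentinel, otherwise the sentinel.
    public static uint StoredField(long actualOffset)
    {
        return actualOffset >= Sentinel ? Sentinel : (uint)actualOffset;
    }

    // What a reader should use: the 64-bit value from the ZIP64 extra
    // field whenever the 32-bit field holds the sentinel.
    public static long EffectiveOffset(uint storedField, long? zip64Offset)
    {
        if (storedField != Sentinel) return storedField;
        if (zip64Offset == null)
            throw new InvalidOperationException("Sentinel offset without a ZIP64 value.");
        return zip64Offset.Value;
    }
}
```

With the numbers from this thread, an entry at offset 4294430697 still fits in the 32-bit field (0xfff7cfe9), while an entry at 4295156122 is stored as 0xFFFFFFFF and its real offset has to come from the ZIP64 extra field.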

In looking at the tests for DotNetZip, they cover the case where WinZip12 unzips a ZIP64 archive that DotNetZip produces, but not the converse.  That test succeeds: WinZip can extract a ZIP64 archive larger than 4 GiB that is produced by DotNetZip.  I'll build another test for the other case, and we'll see if it works.

Coordinator
Jul 24, 2009 at 11:35 AM

Also, "oversubscribed dynamic bit lengths tree" generally indicates corruption.  Have you got a secure hash on that file?  Something that would allow you to verify that the zip file had not been corrupted or damaged between the time it was produced and the time you got it?
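If the sender can supply a digest, comparing it against a locally computed one is a quick way to rule out transfer damage. A minimal sketch, assuming SHA-256 is the hash in use (the thread does not say which one, if any, is available); the helper name is hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHashSketch
{
    // Returns the SHA-256 digest of a file as lowercase hex.
    // The file is streamed, so a multi-gigabyte archive is not loaded into memory.
    public static string Sha256Hex(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            byte[] digest = sha.ComputeHash(stream);
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

A mismatch between the two digests would point at corruption in transit rather than at the library reading the archive.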

If you want to learn more about the "oversubscribed dynamic bit lengths tree" error - you can read these resources:

 

Jul 24, 2009 at 8:59 PM

I think I have seen something similar to this before and I have a theory as to what causes it.  I think it is due to the way that DotNetZip parses the Zip64 extra field.

The case where I have seen it (which sounds similar to what the original poster here described) is where the Zip file itself is over 4GB but none of the individual files have an uncompressed or compressed size over 4GB.  In this case, when creating the Zip file, WinZip just puts the relative offset of local header in the Zip64 field.  The field is therefore just 8 bytes in length.  However, when reading the file with DotNetZip, this piece of code in ProcessExtraField appears to be assuming that if it expects to find the relative offset in the Zip64 field, then the uncompressed and compressed size will be there too.  If the length of the field is less than 24 it won't read the relative offset.

 

if (this._UncompressedSize == 0xFFFFFFFF && DataSize >= 8)
{
    this._UncompressedSize = BitConverter.ToInt64(Buffer, j);
    j += 8;
}
if (this._CompressedSize == 0xFFFFFFFF && DataSize >= 16)
{
    this._CompressedSize = BitConverter.ToInt64(Buffer, j);
    j += 8;
}
if (this._RelativeOffsetOfLocalHeader == 0xFFFFFFFF && DataSize >= 24)
{
    this._RelativeOffsetOfLocalHeader = BitConverter.ToInt64(Buffer, j);
    j += 8;
}

As far as I understand it, the Zip spec says that any of the uncompressed size, compressed size, relative offset and disk number can be present in the Zip64 field but that they will always appear in the same order.  So I think DotNetZip needs to cater for the case where just the relative offset is present.

My guess as to what is causing the error is that the relative offset gets incorrectly set to 0xFFFFFFFF and it tries to decompress the data at that point in the file.  This causes a problem with ZLib because it is not valid deflated data.
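To make the theory concrete, here is a small stand-alone sketch that mirrors the size thresholds in the excerpt above and feeds it a ZIP64 block containing only an offset. It is a simplified stand-in with hypothetical names and none of the surrounding DotNetZip state, not the library's actual method, and like the excerpt it relies on BitConverter reading little-endian on the host.

```csharp
using System;

// Simplified stand-in for the size checks quoted above; the class and
// method names are hypothetical and this is not DotNetZip's actual code.
public static class QuotedCheckSketch
{
    // Returns the offset a reader ends up with when the ZIP64 data block
    // is `data` and the 32-bit header fields hold the given values.
    public static long OffsetSeenByReader(byte[] data, long uncompressed,
                                          long compressed, long offset)
    {
        int j = 0;
        int dataSize = data.Length;
        if (uncompressed == 0xFFFFFFFF && dataSize >= 8)
        {
            uncompressed = BitConverter.ToInt64(data, j);
            j += 8;
        }
        if (compressed == 0xFFFFFFFF && dataSize >= 16)
        {
            compressed = BitConverter.ToInt64(data, j);
            j += 8;
        }
        // With an 8-byte block this test fails, so the sentinel is never replaced.
        if (offset == 0xFFFFFFFF && dataSize >= 24)
        {
            offset = BitConverter.ToInt64(data, j);
            j += 8;
        }
        return offset;
    }
}
```

Given an 8-byte block and ordinary sizes in the header, the method returns 0xFFFFFFFF unchanged, which is the situation described above: the reader would then seek to the wrong place in the archive.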

 

Jul 24, 2009 at 10:34 PM

I think you are on to something.  Looking at the .zip details again in WinZip 12, at the exact file where the relative offset first reports 0xFFFFFFFF, there is a Zip64 tag showing up on the entry.  The length of the field is only 8 bytes, but it appears to hold the RelativeOffset.

I've copied in the central directory entries for these files:

31303 (last file before the relative offset exceeds 0xFFFFFFFF),

31304 (a directory entry and relative offset set to 0xFFFFFFFF)

31305 (a file entry with relative offset set to 0xFFFFFFFF)

Note: the filenames were changed due to the sensitivity of the information.

Thanks for your help.

 

 

 

Central directory entry PK0102 (4+42): #31303
======================================
    part number in which file begins (0000):        1
    relative offset of local header:                4294430697 (0xfff7cfe9) bytes
    version made by operating system (00):          MS-DOS, OS/2, NT FAT
    version made by zip software (20):              2.0
    operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
    unzip software version needed to extract (20):  2.0
    general purpose bit flag (0x0002) (bit 15..0):  0000.0000 0000.0010
      file security status  (bit 0):                not encrypted
      extended local header (bit 3):                no
    compression method (08):                        deflated
      compression sub-type (deflation):             maximum
    file last modified on (0x0000395b 0x0000baa7):  2008-10-27 23:21:14
    32-bit CRC value:                               0xaea49b0e
    compressed size:                                725291 bytes
    uncompressed size:                              1001275 bytes
    length of filename:                             104 characters
    length of extra field:                          36 bytes
    length of file comment:                         0 characters
    internal file attributes:                       0x0001
      apparent file type:                           text
    external file attributes:                       0x00000020
      non-MSDOS external file attributes:           0x000000
      MS-DOS file attributes (0x20):                arc
Current Location part 1 offset 4338824867
    filename:<FILENAME>.txt
Current Location part 1 offset 4338824971
    extra field 0x000a (PKWARE Win32 Filetimes), 4 header and 32 data bytes:
    00 00 00 00 01 00 18 00 00 19 3c 9d b4 38 c9 01 ..........<..8..
    00 19 3c 9d b4 38 c9 01 00 19 3c 9d b4 38 c9 01 ..<..8....<..8..
    The Extended Timestamps are:
      Creation Date:       2008-10-27 23:21:14
      Last Modified Date:  2008-10-27 23:21:14
      Last Accessed Date:  2008-10-27 23:21:14
Current Location part 1 offset 4338825007
Central directory entry PK0102 (4+42): #31304
======================================
    part number in which file begins (0000):        1
    relative offset of local header:                4294967295 (0xffffffff) bytes
    version made by operating system (00):          MS-DOS, OS/2, NT FAT
    version made by zip software (45):              4.5
    operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
    unzip software version needed to extract (45):  4.5
    general purpose bit flag (0x0000) (bit 15..0):  0000.0000 0000.0000
      file security status  (bit 0):                not encrypted
      extended local header (bit 3):                no
    compression method (00):                        none (stored)
    file last modified on (0x00003ae7 0x00005509):  2009-07-07 10:40:18
    32-bit CRC value:                               0x00000000
    compressed size:                                0 bytes
    uncompressed size:                              0 bytes
    length of filename:                             59 characters
    length of extra field:                          48 bytes
    length of file comment:                         0 characters
    internal file attributes:                       0x0000
      apparent file type:                           binary
    external file attributes:                       0x00000010
      non-MSDOS external file attributes:           0x000000
      MS-DOS file attributes (0x10):                dir
Current Location part 1 offset 4338825053
    filename:<DIRECTORY>
Current Location part 1 offset 4338825112
    extra field 0x000a (PKWARE Win32 Filetimes), 4 header and 32 data bytes:
    00 00 00 00 01 00 18 00 90 65 96 3a 19 ff c9 01 .........e.:....
    50 0f e1 71 19 ff c9 01 00 54 7f 81 42 39 c9 01 P..q.....T.B9..
    The Extended Timestamps are:
      Creation Date:       2008-10-28 16:16:56
      Last Modified Date:  2009-07-07 10:40:18
      Last Accessed Date:  2009-07-07 10:41:52
    extra field 0x0001 (ZIP64 Tag), 4 header and 8 data bytes:
    9a e1 02 00 01 00 00 00                         ........
    ZIP64 Tag Value(s):
      Value #1: 4295156122
Current Location part 1 offset 4338825160
Central directory entry PK0102 (4+42): #31305
======================================
    part number in which file begins (0000):        1
    relative offset of local header:                4294967295 (0xffffffff) bytes
    version made by operating system (00):          MS-DOS, OS/2, NT FAT
    version made by zip software (45):              4.5
    operat. system version needed to extract (00):  MS-DOS, OS/2, NT FAT
    unzip software version needed to extract (45):  4.5
    general purpose bit flag (0x0002) (bit 15..0):  0000.0000 0000.0010
      file security status  (bit 0):                not encrypted
      extended local header (bit 3):                no
    compression method (08):                        deflated
      compression sub-type (deflation):             maximum
    file last modified on (0x0000395b 0x0000baab):  2008-10-27 23:21:22
    32-bit CRC value:                               0xe7ac4931
    compressed size:                                367 bytes
    uncompressed size:                              635 bytes
    length of filename:                             99 characters
    length of extra field:                          48 bytes
    length of file comment:                         0 characters
    internal file attributes:                       0x0001
      apparent file type:                           text
    external file attributes:                       0x00000020
      non-MSDOS external file attributes:           0x000000
      MS-DOS file attributes (0x20):                arc
Current Location part 1 offset 4338825206
    filename:<FILENAME>.txt
Current Location part 1 offset 4338825305
    extra field 0x000a (PKWARE Win32 Filetimes), 4 header and 32 data bytes:
    00 00 00 00 01 00 18 00 00 cd 00 a2 b4 38 c9 01 .............8..
    00 cd 00 a2 b4 38 c9 01 00 cd 00 a2 b4 38 c9 01 .....8.......8..
    The Extended Timestamps are:
      Creation Date:       2008-10-27 23:21:22
      Last Modified Date:  2008-10-27 23:21:22
      Last Accessed Date:  2008-10-27 23:21:22
    extra field 0x0001 (ZIP64 Tag), 4 header and 8 data bytes:
    f3 e1 02 00 01 00 00 00                         ........
    ZIP64 Tag Value(s):
      Value #1: 4295156211
Current Location part 1 offset 4338825353
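
As a check on the listing above, the 8 data bytes of a ZIP64 tag can be decoded by hand as a little-endian 64-bit integer: `9a e1 02 00 01 00 00 00` is 0x000000010002E19A, that is 4294967296 + 188826 = 4295156122, which matches the reported Value #1 for entry 31304. A short sketch that does the same decoding without depending on the host's byte order (the helper name is hypothetical):

```csharp
using System;

public static class LittleEndianSketch
{
    // Decodes the 8 bytes starting at `start` as an unsigned
    // little-endian integer (least significant byte first).
    public static ulong ReadUInt64LE(byte[] buffer, int start)
    {
        ulong value = 0;
        for (int i = 7; i >= 0; i--)
        {
            value = (value << 8) | (ulong)buffer[start + i];
        }
        return value;
    }
}
```

Both values sit just past 0xFFFFFFFF (4294967295), which is consistent with the 8 bytes being the relative offset of the local header rather than a file size.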

 

 

Coordinator
Jul 24, 2009 at 10:58 PM

RMLewin, good suggestion. 

..ProcessExtraField appears to be assuming that if it expects to find the relative offset in the Zip64 field, then the uncompressed and compressed size will be there too.  If the length of the field is less than 24 it won't read the relative offset..... As far as I understand it, the Zip spec says that any of the uncompressed size, compressed size, relative offset and disk number can be present in the Zip64 field but that they will always appear in the same order. So I think DotNetZip needs to cater for the case where just the relative offset is present.

You are correct, the ProcessExtraField assumes that both uncompressed and compressed sizes are in the extra field.  I assume RelativeOffset *may* be present.  Here's the part of the spec I am relying on, for that:

          Value      Size       Description
          -----      ----       -----------
  (ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
          Size       2 bytes    Size of this "extra" block
          Original 
          Size       8 bytes    Original uncompressed file size
          Compressed
          Size       8 bytes    Size of compressed data
          Relative Header
          Offset     8 bytes    Offset of local header record
          Disk Start
          Number     4 bytes    Number of the disk on which
                                this file starts 

          This entry in the Local header must include BOTH original
          and compressed file size fields. If encrypting the 
          central directory and bit 13 of the general purpose bit
          flag is set indicating masking, the value stored in the
          Local Header for the original file size will be zero.

I have not yet examined a ZIP64 file generated by WinZip to see the ZIP64 extra field.

Coordinator
Jul 24, 2009 at 11:03 PM

I think the key phrase is "The [ZIP64] entry in the Local Header must include both original and compressed file size fields."

On the other hand, there is also an entry in the central directory record for the ZIP64 info.  I have designed the code with the idea that the entry in the central directory will also include both original and compressed file size fields, but this is not correct.

This is a bug.

Jul 24, 2009 at 11:03 PM

 

Changing the DataSize check to just 8 in every condition should fix this.  I want to try this as a test, but I am unable to compile the Zip Full DLL project; it fails with the error "Unable to copy from the "obj\Debug\Ionic.Zip.dll" to "bin\Debug\Ionic.Zip.dll"".

if (this._UncompressedSize == 0xFFFFFFFF && DataSize >= 8)
{
    this._UncompressedSize = BitConverter.ToInt64(Buffer, j);
    j += 8;
}
if (this._CompressedSize == 0xFFFFFFFF && DataSize >= 8)
{
    this._CompressedSize = BitConverter.ToInt64(Buffer, j);
    j += 8;
}
if (this._RelativeOffsetOfLocalHeader == 0xFFFFFFFF && DataSize >= 8)
{
    this._RelativeOffsetOfLocalHeader = BitConverter.ToInt64(Buffer, j);
    j += 8;
}

Coordinator
Jul 24, 2009 at 11:15 PM
Edited Jul 24, 2009 at 11:30 PM

It's not as simple as that, I think.  But you're right that it's a pretty simple fix.  

The problem with your code is that it does not handle the error case where DataSize==8 and more than one of {UncompressedSize, CompressedSize, RelativeOffset} is 0xFFFFFFFF.  In that case, DotNetZip must throw an error.
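One way to express that check is to consume the block field by field, reading a 64-bit value only for header fields that hold 0xFFFFFFFF and failing if the block runs out early. The following is a sketch under those assumptions, with hypothetical names; it is not the code that shipped in DotNetZip.

```csharp
using System;

// Hypothetical sketch, not the fix that shipped in DotNetZip.
public sealed class Zip64Fields
{
    public long UncompressedSize;
    public long CompressedSize;
    public long RelativeOffsetOfLocalHeader;
}

public static class Zip64ExtraSketch
{
    const long Sentinel = 0xFFFFFFFF;

    // `fields` holds the 32-bit header values on entry; any that equal the
    // sentinel are replaced, in spec order, from the ZIP64 data block.
    public static void Apply(byte[] data, Zip64Fields fields)
    {
        int j = 0;
        fields.UncompressedSize = Next(data, ref j, fields.UncompressedSize);
        fields.CompressedSize = Next(data, ref j, fields.CompressedSize);
        fields.RelativeOffsetOfLocalHeader = Next(data, ref j, fields.RelativeOffsetOfLocalHeader);
    }

    static long Next(byte[] data, ref int j, long current)
    {
        if (current != Sentinel) return current;    // field is not in the block
        if (j + 8 > data.Length)
            throw new FormatException("ZIP64 extra field is too short for the fields marked 0xFFFFFFFF.");
        long value = BitConverter.ToInt64(data, j); // assumes a little-endian host
        j += 8;
        return value;
    }
}
```

With an 8-byte block and only the offset marked, the offset is read from the first 8 bytes; with an 8-byte block and two fields marked, the second read fails and the method throws instead of silently leaving a sentinel in place.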

I'm testing a fix now. But it will take me a while to create the 5gb zip file.

Jul 24, 2009 at 11:16 PM

This is the part of the spec that I was referring to:

"The order of the fields in the zip64 extended  information record is fixed, but the fields will only appear if the corresponding Local or Central directory record field is set to 0xFFFF or 0xFFFFFFFF."

Coordinator
Jul 24, 2009 at 11:35 PM

Yep, and I had that part covered  - "will only appear etc etc. " is covered by the test for 0xFFFFFFFF.  But the error was assuming that the statement that the original and compressed file sizes MUST be present applied to both the Local Header as well as the central directory header. It clearly applies to the local header. It apparently does not apply to the central directory.

Coordinator
Jul 25, 2009 at 4:53 AM

OK, I haven't run the tests yet; I'm having some trouble with disk space.

But there's a release v1.8.4.11 available, with the fix for the handling of the ZIP64 header.  Try it and see if it works for your file.  Let me know.

Coordinator
Jul 26, 2009 at 5:51 AM

All the tests I ran succeeded with these changes.

Jul 27, 2009 at 5:47 PM

Thanks Cheeso.  That fixed the problem I was having.....   It's not easy, being cheesy....

Coordinator
Jul 27, 2009 at 6:43 PM

Thanks for reporting it, and thanks for working through it with me.

ps: there was a bug in my original fix, so you definitely want v1.8.4.12, not v1.8.4.11.