Creating a zip file creates the temp file twice

Jul 26, 2011 at 1:22 PM

Hi,

I have the following code:

    using System;
    using Ionic.Zip;

    class Program
    {
        // Captured once at startup so the progress handler can redraw
        // the status line in place.
        static int curPosX;
        static int curPosY;

        static void Main(string[] args)
        {
            curPosX = Console.CursorLeft;
            curPosY = Console.CursorTop;
            using (ZipFile zipFile = new ZipFile())
            {
                zipFile.UseZip64WhenSaving = Zip64Option.Always;
                zipFile.CompressionLevel = Ionic.Zlib.CompressionLevel.BestSpeed;
                zipFile.SaveProgress += new EventHandler<SaveProgressEventArgs>(zipFile_SaveProgress);
                zipFile.AddFile(@"C:\TEMP\speedtest_500MB.bin", "backup");
                zipFile.Save(@"C:\TEMP\speedtest_500MB.zip");
            }
        }

        static void zipFile_SaveProgress(object sender, SaveProgressEventArgs e)
        {
            if (e.TotalBytesToTransfer != 0)
            {
                Console.SetCursorPosition(curPosX, curPosY);
                Console.WriteLine("Writing entry {0} of {1}, currently at {2} kB of {3} kB",
                    e.EntriesSaved, e.EntriesTotal,
                    e.BytesTransferred / 1024, e.TotalBytesToTransfer / 1024);
            }
        }
    }

This code takes a 500 MB binary file and compresses it. But once the temporary file has been written for the first time, the library immediately truncates it and starts all over again. Only after the second pass is the temporary file renamed to speedtest_500MB.zip.

Is this normal behavior? Or am I doing something wrong?

Regards,

Marcel.

Coordinator
Jul 26, 2011 at 10:07 PM

Yes, the behavior you describe is expected. It's normal (probably).

There's a feature in the zip library that stores the file into the zip "uncompressed" if "compression" would actually increase its size. This feature was added a long time ago, to combat the anomalous behavior of the built-in System.IO.Compression.DeflateStream, which in some cases can increase the size of a file by 40% or more; that happens with previously compressed data. DotNetZip no longer uses System.IO.Compression.DeflateStream - it uses its own managed ZLIB library for compression - so those massive increases in size are no longer possible, but the "check for increase" feature is still in there, and it catches even small increases in size.
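To make the logic concrete, here is a minimal sketch of the idea - not DotNetZip's actual implementation, just an illustration using the framework's DeflateStream: compress into a buffer, compare sizes, and keep the original bytes if "compressing" made them bigger.

    using System.IO;
    using System.IO.Compression;

    // Sketch only: compress, but fall back to the raw bytes ("stored")
    // if deflation would make the entry larger.
    static byte[] CompressOrStore(byte[] input, out bool stored)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true so buffer.Length is still readable after the
            // deflater flushes its final block on Dispose.
            using (var deflater = new DeflateStream(buffer, CompressionMode.Compress, true))
            {
                deflater.Write(input, 0, input.Length);
            }
            if (buffer.Length >= input.Length)
            {
                stored = true;   // "compression" grew the data; keep it as-is
                return input;
            }
            stored = false;
            return buffer.ToArray();
        }
    }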

Looking at your code I see that you are using BestSpeed for the compression level. This can in some cases produce a slightly larger "compressed" file than the original "uncompressed" file. It happens only when the original file is not really uncompressed - that is, when it is already in some compressed format. So the behavior you describe is normal if you are dealing with a file that is already compressed.

There's currently no way to turn off the "retry if the file size increases" feature.

I think if you raise the compression level to something better, you may avoid the problem. Conversely, if you know the file will not compress, you can specify CompressionLevel = None and the file will be stored directly; no attempt will be made to compress it.
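For example, based on the code in the original post (same paths, just a different compression level):

    using (ZipFile zipFile = new ZipFile())
    {
        // No compression: each entry is written once, as "stored",
        // so the save makes a single pass and the temp file is
        // written only one time.
        zipFile.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
        zipFile.AddFile(@"C:\TEMP\speedtest_500MB.bin", "backup");
        zipFile.Save(@"C:\TEMP\speedtest_500MB.zip");
    }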


Jul 27, 2011 at 9:59 AM
Thanks!

Because I used a file containing 500 MB of random bytes, there was nothing to compress, so the size indeed increases a little when you store it in a zip file. I tried again with a file that compresses well, and now the compression runs only once.
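For anyone who wants to reproduce this, here is roughly how such a test file can be generated (the path matches the one in this thread; the chunked loop is just to keep memory use low):

    using System;
    using System.IO;

    // Write 500 x 1 MB of random (and therefore incompressible) bytes.
    static void MakeRandomTestFile()
    {
        var rng = new Random();
        var chunk = new byte[1024 * 1024];
        using (var fs = File.Create(@"C:\TEMP\speedtest_500MB.bin"))
        {
            for (int i = 0; i < 500; i++)
            {
                rng.NextBytes(chunk);
                fs.Write(chunk, 0, chunk.Length);
            }
        }
    }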



Coordinator
Jul 27, 2011 at 12:36 PM

Ahh, right, random data will also be mostly incompressible.

Compression - all algorithms - works by replacing repeated patterns with smaller "encodings". Random data won't exhibit those patterns and therefore won't be highly compressible.
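A quick way to see this for yourself (a standalone demo using the framework's DeflateStream; exact sizes will vary):

    using System;
    using System.IO;
    using System.IO.Compression;

    class CompressibilityDemo
    {
        static long DeflatedSize(byte[] data)
        {
            using (var ms = new MemoryStream())
            {
                // leaveOpen: true so ms.Length is readable afterwards.
                using (var ds = new DeflateStream(ms, CompressionMode.Compress, true))
                {
                    ds.Write(data, 0, data.Length);
                }
                return ms.Length;
            }
        }

        static void Main()
        {
            var repetitive = new byte[1024 * 1024];   // all zeros: one long repeated pattern
            var random = new byte[1024 * 1024];
            new Random().NextBytes(random);           // no patterns to exploit

            // The repetitive buffer shrinks to a few KB; the random buffer
            // stays about the same size, or grows slightly.
            Console.WriteLine("repetitive: {0} -> {1} bytes", repetitive.Length, DeflatedSize(repetitive));
            Console.WriteLine("random:     {0} -> {1} bytes", random.Length, DeflatedSize(random));
        }
    }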