Performance gains with Multi-threaded compression

Coordinator
Nov 5, 2009 at 7:10 AM
Edited Aug 12, 2010 at 3:19 PM

I just posted DotNetZip v1.9.0.29, which includes an update to speed up compression. It uses multiple threads to perform the compression, resulting in a significant performance gain in some cases. 

The chart below shows the time, in seconds, required to compress a directory containing about 1600 files, totaling about 230 MB - text, images, music, and other stuff. I ran hundreds of trials to analyze the performance, and these are the results. What you see is four broad bands or ranges. Starting from the top are all the trials corresponding to the ZipOutputStream class. Beneath that band, showing significantly better performance, is the band for ZipOutputStream using multiple threads. (Shorter bars mean less time, hence better performance.) Then comes the band for the ZipFile class, and finally ZipFile with multiple threads. All these tests were conducted on my laptop - a three-year-old Compaq nc8430 with a Core Duo processor.

As you can see, the use of multiple threads on my laptop increased performance significantly. The average time to compress the directory without using multiple threads was about 40 seconds. Using threads, that average went down to as low as 23 seconds, with a particular combination of buffer sizes. The bottom line: using multiple threads can significantly increase the speed of compression.

[Chart: time per trial, grouped in four bands - ZipOutputStream, ZipOutputStream with threads, ZipFile, ZipFile with threads]

Don't forget - you get this performance for free.  There's no code change required. All you have to do is get v1.9.0.29.   The performance gain comes from an improvement I made within the library.  Your zip operations should get faster, automatically.  Also, the compression effectiveness does not change significantly.  It still compresses "about the same" as a regular deflate.  There may be a slight increase in size to go along with this increase in speed - but it should be in the range of 0.1% to 5%.  And, it's still completely compatible with all the other zip tools.  It still uses the DEFLATE algorithm.  It's just faster.
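
For example, typical code like the following (the paths are illustrative) gets the speedup automatically once you rebuild against v1.9.0.29:

    using Ionic.Zip;

    // Nothing multi-thread-specific here - the parallel
    // compression happens inside the library.
    using (var zip = new ZipFile())
    {
        zip.AddDirectory(@"c:\data\reports");
        zip.Save(@"c:\archives\reports.zip");
    }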

How did I do it?  I split large files into chunks, and compress the chunks independently.  Each chunk is compressed on a background thread (from the .NET threadpool).  If you look at a CPU monitor on a multi-cpu or multi-core computer when zipping with DotNetZip versions prior to v1.9.0.29, typically you will see one processor totally maxed out, and the other processor or processors doing relatively nothing.  This is because there was just one thread doing all the work.  Now, with the multi-threaded design, all of the processors or cores can be used to perform the compression. The 45% drop in compression time might be even larger on a 4-cpu machine, as many desktops are today.  On an 8-core machine, as many servers are, you might see a 70% drop in compression time. I don't know, because I don't have an appropriate machine to test on.
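
To make the idea concrete, here's a sketch of chunked, parallel compression. This is not DotNetZip's actual internals - just the pattern, assuming .NET 4's Parallel.For, with an illustrative chunk size:

    using System;
    using System.IO;
    using System.IO.Compression;
    using System.Threading.Tasks;

    class ParallelChunkSketch
    {
        const int ChunkSize = 128 * 1024;  // illustrative

        // Deflate each chunk independently; Parallel.For runs the
        // chunks on threadpool threads, so all cores work at once.
        static byte[][] CompressChunks(byte[] input)
        {
            int chunkCount = (input.Length + ChunkSize - 1) / ChunkSize;
            var compressed = new byte[chunkCount][];
            Parallel.For(0, chunkCount, i =>
            {
                int offset = i * ChunkSize;
                int count = Math.Min(ChunkSize, input.Length - offset);
                using (var ms = new MemoryStream())
                {
                    using (var d = new DeflateStream(ms, CompressionMode.Compress, true))
                        d.Write(input, offset, count);
                    compressed[i] = ms.ToArray();
                }
            });
            return compressed;
        }

        static void Main()
        {
            var data = new byte[8 * 1024 * 1024];
            new Random(42).NextBytes(data);
            Console.WriteLine("{0} chunks compressed", CompressChunks(data).Length);
        }
    }

One thing the sketch glosses over: each chunk above is a complete, independent deflate stream, so you can't just concatenate them into a zip entry. The trick is to flush each chunk to a byte boundary without the final-block bit, so the pieces splice together into one valid DEFLATE stream - that's the part the library has to take care of internally, and it's why the output stays compatible with all the other zip tools.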

Just for fun, I also did some comparisons with SharpZipLib, which is an alternative zip library for .NET.  SharpZipLib is pretty fast - the FastZip interface was zipping the same directory in about 35 seconds.  DotNetZip without multiple threads does it in about 39 seconds.  But with the multiple threads, DotNetZip's best time got down to 21 seconds.  Pretty nice gain. 

Now, what about the variations within the bands?  What is that all about?  In DotNetZip, there are two classes of buffers used internally - those used for regular IO (the IO buffer, used for reading and writing files or streams), and those used for the compressor (the codec buffer).  The former can be set with the BufferSize property on the ZipFile or ZipOutputStream; the latter can be set with the CodecBufferSize property. I wanted to find out whether varying the sizes of those buffers would have a significant impact on compression speed.  It turns out there is an effect, though it is much smaller than the effect of using multiple threads.  There was a variation of 10-20% between the best and worst combinations of buffer sizes.

For my test situation, the best times came with an IO buffer of 32k or 64k, and across a wide range of codec buffer sizes.
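
If you want to pin the buffers to that sweet spot yourself, it's just two property assignments. The values here are simply the ones that did well in my trials; yours may differ:

    using Ionic.Zip;

    using (var zip = new ZipFile())
    {
        zip.BufferSize = 64 * 1024;       // IO buffer: 64k
        zip.CodecBufferSize = 32 * 1024;  // codec buffer: 32k
        zip.AddDirectory(@"c:\data\reports");
        zip.Save(@"c:\archives\reports.zip");
    }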

Here's a closer look at one of the bands. All of the bars in the following chart pertain to the ZipOutputStream class.
[Chart: ZipOutputStream trials - compression time vs. IO and codec buffer sizes]
This one shows that buffer sizes of 1 MB led to poor performance on my machine, as did very small buffers (2k). The sweet spot was an IO buffer size between 8k and 64k, and a codec buffer size between 4k and 128k.  You can see some noise in the chart - that's probably because this was not a dedicated benchmark machine.  It was my laptop, and I was using it while the tests were running. I also think a virus scan kicked off at some point in there.  That explains the spikes in compression time that are evident there.  I ran over 800 trials over the course of a day.


These results will vary depending on the kinds of files you compress, the amount of memory on your system, the speed of your I/O subsystem, the speed of memory access, and the compression level you use. (All these results used the default compression level.  In a few tests I found smaller gains from multiple threads when using lower compression levels.)  I publish these charts only to show the relative importance of the buffer sizes versus the multi-threaded approach.

There is a limitation - the performance gains in compression are nice, but when you also encrypt your zip entries, you get the "old" performance.  This is because encryption is a CPU-intensive task as well, and I haven't yet updated the library to use multiple threads when encrypting.  As a result, everything slows down to the slowest part of the pipeline, which is encryption.  In the future I hope to speed that up as well.
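
Concretely, an archive built like this (Password and Encryption are the library's real properties; the values are illustrative) still bottlenecks on the single-threaded encryption step:

    using Ionic.Zip;

    using (var zip = new ZipFile())
    {
        zip.Password = "VerySecret!";  // illustrative
        zip.Encryption = EncryptionAlgorithm.WinZipAes256;
        zip.AddDirectory(@"c:\data\reports");
        zip.Save(@"c:\archives\reports.zip");  // encryption runs on one thread
    }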

Any questions, let me know.

 

Aug 10, 2010 at 7:48 PM

Cheeso,

Thanks for posting this.

(1) When I initialize a ZipFile instance, the default for CodecBufferSize is 0. Is this intended?

(2) Is CodecBufferSize a buffer for the uncompressed data that is going to be compressed? I.e., if size = 2048, then compression does not take place until 2048 bytes of data have been written?

Coordinator
Aug 12, 2010 at 3:30 PM

The 0 for CodecBufferSize is a marker for "use the default value".  I don't believe the actual default value is currently published or documented, but I may be wrong about that. 

The way the codec buffer works: using the IO buffer, the library tries to read N bytes from the input file or stream; depending on the IO conditions, that read may be fully satisfied, partially satisfied, or not satisfied at all (zero bytes read).  If the read is fully or partially satisfied, then DotNetZip will do compression on that portion of data.  If zero bytes are read, that indicates EOF, and DotNetZip concludes that the file or stream has been read fully, and stops. 

DotNetZip then does chunkwise compression of the data in the IO buffer, in this way: it copies a chunk into the codec buffer, compresses it, then copies it to the output stream.  Therefore if your codec buffer is 2048 bytes, DotNetZip will try to compress 2048 bytes at a time.  

So when the stream is larger than 2048 bytes, if your codec buffer is 2048 bytes, then YES, compression does not take place until you read/write 2048 bytes.  If the stream is smaller than 2048 bytes, then compression occurs at input EOF.  
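
A rough sketch of that loop, just to make the two buffers concrete. The structure follows the description above; the names and the use of DeflateStream are mine, not the library's internals:

    using System;
    using System.IO;
    using System.IO.Compression;

    class TwoBufferSketch
    {
        static void Compress(Stream input, Stream output,
                             int bufferSize, int codecBufferSize)
        {
            var ioBuf = new byte[bufferSize];         // the IO buffer (BufferSize)
            var codecBuf = new byte[codecBufferSize]; // the codec buffer (CodecBufferSize)
            using (var deflater = new DeflateStream(output, CompressionMode.Compress, true))
            {
                int n;
                while ((n = input.Read(ioBuf, 0, ioBuf.Length)) > 0)  // zero bytes => EOF, stop
                {
                    // Compress the filled portion of the IO buffer,
                    // one codec-buffer-sized chunk at a time.
                    for (int offset = 0; offset < n; offset += codecBuf.Length)
                    {
                        int count = Math.Min(codecBuf.Length, n - offset);
                        Array.Copy(ioBuf, offset, codecBuf, 0, count);
                        deflater.Write(codecBuf, 0, count);
                    }
                }
            }
        }
    }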

I caution you not to draw performance conclusions from the description of how things work.  The behavior of a program that uses DotNetZip depends on many factors.  Really, the only way to optimize performance of such a complex system is to try it and measure. With the various settings, DotNetZip gives you the chance to tune things as appropriate for your usage.
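
For example, a simple harness along these lines (the paths and the size grid are illustrative) will tell you which combination wins for your own data:

    using System;
    using System.Diagnostics;
    using Ionic.Zip;

    class BufferBenchmark
    {
        static void Main()
        {
            int[] sizes = { 8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024 };
            foreach (int ioSize in sizes)
                foreach (int codecSize in sizes)
                {
                    var sw = Stopwatch.StartNew();
                    using (var zip = new ZipFile())
                    {
                        zip.BufferSize = ioSize;
                        zip.CodecBufferSize = codecSize;
                        zip.AddDirectory(@"c:\data\testdir");
                        zip.Save(@"c:\temp\test.zip");
                    }
                    sw.Stop();
                    Console.WriteLine("io={0,6}  codec={1,6}  {2} ms",
                                      ioSize, codecSize, sw.ElapsedMilliseconds);
                }
        }
    }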

Good luck.