Memory Leak in ParallelDeflateOutputStream??

Jan 6, 2010 at 4:25 AM

I have a project where I am zipping lots of files continuously. However, the program quite quickly starts using a lot of memory (it got up to 600 MB after a couple of minutes). The larger the files it zips, the more quickly the memory is used up. I am watching the memory use through Windows Task Manager on XP. I am using version 1.9.0.31 of the library.

I am using the following code:

Using zip As ZipFile = New ZipFile
  zip.MaxOutputSegmentSize = 20 * 1024 * 1024   '' 20mb
  zip.Encryption = EncryptionAlgorithm.None
  zip.CompressionLevel = Ionic.Zlib.CompressionLevel.BestCompression
  zip.AddFile(locationOfFileToZip)
  zip.Save(fileLocation)
  zip.Dispose()
End Using

Not sure if I am missing something, or need to clear anything out, or do any rubbish collecting?

Thanks All !

 

 

Coordinator
Jan 6, 2010 at 9:20 AM
Edited Jan 6, 2010 at 7:38 PM

You don't need to do any rubbish collecting.

You also don't need to call zip.Dispose() if you are using a Using clause.

Does the memory usage never stop increasing? 

If you zip large files, DotNetZip will allocate large buffers.   This is normal.  Often the .NET garbage collector will not GC the unused buffers until there is memory pressure.  In some cases people incorrectly conclude this indicates a memory leak.  Large memory usage, even 600mb, does not necessarily indicate a memory leak. 

Evidence of a memory leak is consistent memory growth, with every iteration.  Is that what you are seeing?  How many bytes does the memory usage grow, for each iteration? 
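One way to answer the per-iteration question is to log memory figures after each batch of iterations. The following is only a minimal sketch: the LeakCheck class and its Run method are hypothetical names, not part of DotNetZip, and the batch sizes are arbitrary. It uses only standard .NET calls (GC.Collect, GC.GetTotalMemory, Process.PrivateMemorySize64).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of DotNetZip.
static class LeakCheck
{
    // Runs 'work' repeatedly and prints memory figures after each batch.
    public static void Run(Action work, int batches, int iterationsPerBatch)
    {
        for (int b = 1; b <= batches; b++)
        {
            for (int i = 0; i < iterationsPerBatch; i++)
            {
                work();
            }

            // Force a full collection so that only reachable objects are counted.
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            long managedBytes = GC.GetTotalMemory(false);
            long privateBytes;
            using (Process current = Process.GetCurrentProcess())
            {
                privateBytes = current.PrivateMemorySize64;
            }

            Console.WriteLine("batch {0}: managed={1:N0} private={2:N0}",
                              b, managedBytes, privateBytes);
        }
    }
}
```

If the figures level off after the first few batches, the process has probably just warmed up. If they climb by a similar amount in every batch, that points to a leak. Forcing the GC here is for measurement only; it is not something the production code needs to do.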

 

Jan 20, 2010 at 5:00 AM

I am seeing the same thing. I have version 1.9.1.1. It appears that something in Save() is increasing my memory usage by roughly the size of the zip file that is written to disk. I have tried calling Dispose() as well as invoking the GC, but nothing appears to shrink or release the memory; it grows every time I save a zip file. I looked at the ZipFile source but was not able to find the issue. Like JBGillet said, it becomes a problem when you create large numbers of zip files; that is when you start to notice the memory usage. Here is sample code to reproduce the issue.

 

        private void makeZipFile(string zipFile, string[] filePaths, string comment)
        {

            using (ZipFile zip = new ZipFile())
            {
                zip.CompressionLevel = CompressionLevel.BestCompression;
                foreach (string fileName in filePaths){
                    zip.AddFile(fileName,"");
                }
                zip.Comment = comment;
                zip.Save(zipFile);
                zip.Dispose();
            }
        }

 

Coordinator
Jan 20, 2010 at 2:59 PM

First, your call to ZipFile.Dispose() is unnecessary, as it is called implicitly when the using scope exits.

Second, I did some analysis work.

I grabbed Ionic.Zip.dll v1.9.1.1 and ran your code (removing the call to Dispose()) in a simple loop of 1024 iterations. In other words, it produces a zip file 1024 times in a loop. As input for the zip file, I used a set of 128 C# source code files. The output zip file used the same name for every iteration. After each set of 16 iterations, I did an analysis of the .NET heap and of the working set of the process. What I found was steady state. There was no increase in the size of the working set of the process, or in the size of the .NET heap, with each successive 16 iterations.

If there had been a memory leak, I would expect to see growth in the .NET heap, or in the private bytes used by the process.  But I saw no growth. 

To do this analysis, I used the WinDbg.exe tool, part of the Debugging Tools for Windows.  I followed the recommendations of the Microsoft .NET Servicing engineers, regarding how to analyze memory leaks in .NET processes.  Specifically I used the !eeheap -gc and !dumpheap -stat commands, available from sos.dll. 

An excerpt of what I found is here:

Cycle 1:

PDB symbol for mscorwks.dll not loaded
Number of GC Heaps: 1
generation 0 starts at 0x01c21018
generation 1 starts at 0x01c2100c
generation 2 starts at 0x01c21000
ephemeral segment allocation context: none
 segment    begin allocated     size
01c20000 01c21000  01cb7ff4 0x00096ff4(618484)
Large object heap starts at 0x02c21000
 segment    begin allocated     size
02c20000 02c21000  02c24260 0x00003260(12896)
Total Size   0x9a254(631380)
------------------------------
GC Heap Size   0x9a254(631380)



Cycle 2:

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01d058a8
generation 1 starts at 0x01c8d4b8
generation 2 starts at 0x01c21000
ephemeral segment allocation context: none
 segment    begin allocated     size
01c20000 01c21000  01e5c988 0x0023b988(2341256)
Large object heap starts at 0x02c21000
 segment    begin allocated     size
02c20000 02c21000  02c25260 0x00004260(16992)
Total Size  0x23fbe8(2358248)
------------------------------
GC Heap Size  0x23fbe8(2358248)


Cycle 3:

Number of GC Heaps: 1
generation 0 starts at 0x01cd7d88
generation 1 starts at 0x01c5f998
generation 2 starts at 0x01c21000
ephemeral segment allocation context: none
 segment    begin allocated     size
01c20000 01c21000  01e2ee68 0x0020de68(2154088)
Large object heap starts at 0x02c21000
 segment    begin allocated     size
02c20000 02c21000  02c25260 0x00004260(16992)
Total Size  0x2120c8(2171080)
------------------------------
GC Heap Size  0x2120c8(2171080)


Cycle 4:

Number of GC Heaps: 1
generation 0 starts at 0x01ceeb18
generation 1 starts at 0x01c76728
generation 2 starts at 0x01c21000
ephemeral segment allocation context: none
 segment    begin allocated     size
01c20000 01c21000  01e45bf8 0x00224bf8(2247672)
Large object heap starts at 0x02c21000
 segment    begin allocated     size
02c20000 02c21000  02c25260 0x00004260(16992)
Total Size  0x228e58(2264664)
------------------------------
GC Heap Size  0x228e58(2264664)

Cycle 5:

Number of GC Heaps: 1
generation 0 starts at 0x01d058a8
generation 1 starts at 0x01c8d4b8
generation 2 starts at 0x01c21000
ephemeral segment allocation context: none
 segment    begin allocated     size
01c20000 01c21000  01e5c988 0x0023b988(2341256)
Large object heap starts at 0x02c21000
 segment    begin allocated     size
02c20000 02c21000  02c25260 0x00004260(16992)
Total Size  0x23fbe8(2358248)
------------------------------
GC Heap Size  0x23fbe8(2358248)

As you can see, there is no net growth in the GC heap size, as there would be if a memory leak were present in the DotNetZip code. The !dumpheap output showed similarly steady results, indicating no net growth.

I also used the Perfmon tool and monitored the "Private Bytes" counter for my process as it ran. It produced an absolutely flat line for memory size.

[Image: Perfmon trace of the Private Bytes counter, not preserved]

If you could provide some additional information as to how you are determining that a leak is occurring, I'd be happy to look into this further. Maybe you are using a larger number of files? Maybe you are using a larger number of iterations, beyond 1024? You said "a large number of zip files" but gave no specifics. If you have a more complete code snippet that shows exactly how you reproduce the leak, that would be helpful. Maybe you are trying to compress binary files? Or a mixture of files? Or files above a certain size? What tools, specifically, are you using that lead you to conclude that a memory leak is present? If you can show me output from WinDbg.exe or Perfmon, that would really help.

It's possible there is no leak. In some cases people observe a .NET process, see an initial growth in memory, and conclude it is a leak. But in a managed-memory environment, such as that provided by .NET, there will be a large growth in memory usage as each class is instantiated. Processes "warm up". The definitive evidence of a leak is when the memory usage grows steadily, with each iteration. This is what I have not seen.

 

Jan 20, 2010 at 5:14 PM

I noticed the same thing.

Some details first. I use the Ionic.Zlib namespace only, to compress messages between systems (by hooking into WCF). I'm compressing messages of various sizes, from a few KB to hundreds of MB. Previously I was using DeflateStream, but then I switched only the compressing stream to ParallelDeflateOutputStream, and I noticed the memory leak. To verify, I switched back to DeflateStream and the leak was gone.
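The swap described above could look roughly like the sketch below. It is only an illustration, not the poster's actual code: the MessageCompressor class, the Compress method, and the useParallel flag are hypothetical names, and the WCF hook itself is left out. Both stream types are from Ionic.Zlib.

```csharp
using System.IO;
using Ionic.Zlib;

// Hypothetical helper for illustration; not the poster's code.
static class MessageCompressor
{
    // Compresses 'payload' with either the parallel or the single-threaded deflater.
    public static byte[] Compress(byte[] payload, bool useParallel)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // The final 'true' argument (leaveOpen) keeps 'output' open
            // after the compressing stream is closed.
            Stream deflater = useParallel
                ? (Stream) new ParallelDeflateOutputStream(output, true)
                : new DeflateStream(output, CompressionMode.Compress, true);

            using (deflater)
            {
                deflater.Write(payload, 0, payload.Length);
            }

            return output.ToArray();
        }
    }
}
```

Calling Compress in a loop with useParallel set first to true and then to false, while watching memory, is one way to compare the two stream types on the same input.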

 

Coordinator
Jan 20, 2010 at 5:30 PM

ok, that's good information, worth exploring. I'll look into it and respond here.

Coordinator
Jan 20, 2010 at 6:10 PM
This discussion has been copied to a work item.
Coordinator
Jan 20, 2010 at 6:41 PM

OK, there's definitely a leak.  The ParallelDeflateOutputStream is not being GC'd and all of the buffers maintained by that thing remain in memory.
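For anyone who cannot move to a fixed build right away, one possible stopgap is to keep ZipFile from using the parallel deflater at all. This is not something suggested in this thread; it is a sketch that assumes the ZipFile.ParallelDeflateThreshold property is present in your build and that -1 means "never use parallel deflate", so check the documentation for your version. The class and method names are hypothetical.

```csharp
using Ionic.Zip;

// Hypothetical helper for illustration.
static class ZipWithoutParallelDeflate
{
    public static void Save(string zipPath, string[] filePaths)
    {
        using (ZipFile zip = new ZipFile())
        {
            // Assumption: -1 disables parallel deflate for every entry.
            zip.ParallelDeflateThreshold = -1;

            foreach (string path in filePaths)
            {
                zip.AddFile(path, "");
            }

            zip.Save(zipPath);
        }
    }
}
```

Single-threaded compression will likely be slower on large entries, so this trades throughput for predictable memory use.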

 

Coordinator
Jan 20, 2010 at 9:27 PM

I updated the test case I had, so that it uses the ParallelDeflateOutputStream. The test case is still very simple. It just creates a zip file, many times in a row. Here's the Perfmon trace for my updated test case, using the v1.9.1.1 bits:

[Image: Perfmon trace, v1.9.1.1, not preserved]

You can see that clearly illustrates the leak.

Here's the trace for the test case, using the v1.9.1.2 bits, which include the fix for workitem 10030:

[Image: Perfmon trace, v1.9.1.2, not preserved]

Thanks for reporting this.

v1.9.1.2 will be available shortly.

Coordinator
Jan 21, 2010 at 5:11 AM

v1.9.1.2 is now available.  It includes the fix for this leak.

Thanks for reporting it.

 

Jan 21, 2010 at 5:21 PM
Thanks so much! I will check it out tonight.

Coordinator
Jan 21, 2010 at 5:23 PM

doops, I'll be interested to hear of your results.

 

Jan 22, 2010 at 5:49 PM
The new release works like a charm. The memory is staying just as I would expect it to. Great work! Thanks for getting that done for me.
