Problem adding multiple file streams

Aug 18, 2008 at 6:50 PM
I am using your library in a web application and am having a problem when attempting to add multiple file streams as in the following example:

using (ZipFile zip = new ZipFile(context.Response.OutputStream))
{
    SoapFormatter soapFormatter = new SoapFormatter();

    foreach (string setName in GetAllDataSetNames())
    {
        using (DataSet dataSet = GetDataSet(setName))
        using (MemoryStream memoryStream = new MemoryStream(2048))
        {
            soapFormatter.Serialize(memoryStream, dataSet); // output to memory stream
            zip.AddFileStream(setName, "", memoryStream); // input into the zip archive
        }
    }

    zip.Save(); // <-- Throws ObjectDisposedException (I believe because each reference to memoryStream has been disposed.)
}

I tried moving the call to zip.Save() into the using block for the memoryStream, but the second call to zip.Save() then fails.

As a workaround, I am going to try keeping a list of MemoryStream objects that all stay open until the zip.Save() call, and then dispose each MemoryStream object in the list.

Am I missing something here?

Thanks for your library and your support!

Aug 19, 2008 at 12:53 AM
You are not missing anything.

The first case, where the call to ZipFile.Save() is made after the MemoryStream objects have been disposed, is expected to fail, for the reason you described. Save() is the point at which the data is actually read from the input streams and written to the (output) archive, so the input streams must still be open when Save() runs.

Before I talk about the second case, let me describe the zip format, and how Save() works. A zip archive is a series of streams of data, one per archived file, followed by a footer that contains directory information: a list of all the files stored in the archive along with a bit of metadata for each file. (The zip specification calls this footer the central directory.) When you call Save(), the data stream for each entry in the zip is emitted, and then the footer is produced and written to the output stream. The footer directory contains info for all the files that were in the archive that was read in (if you used the ZipFile() constructor that reads in a zip archive), plus all the files that were added with an AddFile / AddItem method call since instantiation.
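The layout described above can be checked with a short sketch. It deliberately uses the framework's System.IO.Compression classes rather than this library, and the class and method names (ZipLayoutSketch, BuildArchive, and so on) are made up for illustration. It builds a small archive in memory, then locates the end-of-central-directory record, which sits after all of the entry data.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipLayoutSketch
{
    // Builds a small archive in memory. Uses System.IO.Compression, not DotNetZip.
    public static byte[] BuildArchive(params string[] entryNames)
    {
        using (var buffer = new MemoryStream())
        {
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, true))
            {
                foreach (string name in entryNames)
                {
                    ZipArchiveEntry entry = archive.CreateEntry(name);
                    using (Stream entryStream = entry.Open())
                    {
                        byte[] payload = Encoding.UTF8.GetBytes("data for " + name);
                        entryStream.Write(payload, 0, payload.Length);
                    }
                }
            } // disposing the archive writes the directory (the "footer")

            return buffer.ToArray();
        }
    }

    // Scans backwards for the end-of-central-directory signature "PK\x05\x06".
    public static int FindEndOfCentralDirectory(byte[] zip)
    {
        for (int i = zip.Length - 22; i >= 0; i--)
        {
            if (zip[i] == 0x50 && zip[i + 1] == 0x4B && zip[i + 2] == 0x05 && zip[i + 3] == 0x06)
                return i;
        }
        throw new InvalidDataException("No end-of-central-directory record found.");
    }

    // Total number of entries: a 2-byte little-endian field at offset 10 of the record.
    public static int ReadEntryCount(byte[] zip)
    {
        int eocd = FindEndOfCentralDirectory(zip);
        return zip[eocd + 10] | (zip[eocd + 11] << 8);
    }

    // Start of the directory: a 4-byte little-endian field at offset 16 of the record.
    public static int ReadCentralDirectoryOffset(byte[] zip)
    {
        int eocd = FindEndOfCentralDirectory(zip);
        return zip[eocd + 16] | (zip[eocd + 17] << 8) | (zip[eocd + 18] << 16) | (zip[eocd + 19] << 24);
    }
}
```

For an archive built with BuildArchive("a.xml", "b.xml"), the directory offset is greater than zero and smaller than the offset of the end record: the entry data comes first and the directory last, which is why the directory can only be written once every entry is known.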

If you call Save() on a ZipFile that has the filesystem as the backing store, then the Zip library effectively READS the zip archive after having written it. It doesn't really read in all the data, but the effect is the same: the state of the ZipFile instance is as if the file just written had been freshly read. You can then add new files to the ZipFile with AddFile(), or remove files with RemoveFile(), and call Save() again (and again and again). At each Save(), all the file data is written and a new, consistent footer is generated. The filesystem gets the latest zip file. Everything is rosy.

BUT! If the output for the ZipFile is a write-only stream (as it is in your case, the ASP.NET Response output stream), then the ZipFile cannot behave the same way. Once Save() has been called, it is not really possible to call Save() again with reasonable semantics or behavior. When Save() completes, the library has written the footer information to the stream, and the zip file content has been completely written to the output. If you call Save() again, the library cannot "unwrite" the footer information from the first Save(). The library cannot "reel in" the previously written zip data.
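The constraint can be illustrated with a small stand-in for a write-only output. ForwardOnlyStream and ForwardOnlyDemo below are hypothetical names, not part of this library or of ASP.NET; the sketch only shows that once bytes have gone to a non-seekable stream, there is no way to go back and overwrite them.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a write-only output such as an HTTP response stream.
public sealed class ForwardOnlyStream : Stream
{
    private readonly Stream inner;

    public ForwardOnlyStream(Stream inner) { this.inner = inner; }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { inner.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { inner.Write(buffer, offset, count); }
}

public static class ForwardOnlyDemo
{
    // Returns true only if the stream can be rewound to overwrite earlier output.
    public static bool TryRewind(Stream output)
    {
        if (!output.CanSeek) return false;
        output.Seek(0, SeekOrigin.Begin);
        return true;
    }
}
```

TryRewind returns true for a FileStream or MemoryStream, which is what lets a file-backed archive be rewritten on each Save(). For the forward-only wrapper it returns false, so a footer that has already been written stays in the output.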

Therefore, in the 2nd case, where you moved the Save() into the using() clause for the MemoryStream, the first call succeeds, but subsequent calls will not.

If you had used a zip archive on the disk as an output, then this usage pattern would succeed. 

Summing up:
  • it might be reasonable to say, "the library should fail more gracefully, or with a more intuitive error message, when I try to call Save() repeatedly on an output stream". I could look into fixing *that*. Please create a work item if you think this is appropriate.
  • it is certainly reasonable to say "this ought to be documented more clearly" or "this ought to be documented (period)." I will fix this.
  • your workaround, to keep all the streams available and call Save() once, is the only reasonable approach to take in the ASP.NET case where you are writing to the Response.OutputStream stream.
  • A different option is to write the zip to a filesystem file, then, when the last Save() is complete, stream that file to Response.OutputStream.
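Both options in the list above can be sketched without depending on a particular zip API. The class ZipResponseSketch and its methods are hypothetical helpers, and the zip calls are passed in as delegates, so nothing here claims to be this library's own code. SaveOnce keeps every input stream open until a single save has run and only then disposes them. StreamFileTo copies a finished file to an output stream. Rewinding each stream before adding it is an assumption that the zip library reads from the stream's current position.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ZipResponseSketch
{
    // Option 1: keep all input streams open, save exactly once, then dispose them.
    public static void SaveOnce(
        IEnumerable<string> entryNames,
        Func<string, Stream> createContent,  // produces the data for one entry
        Action<string, Stream> addEntry,     // e.g. wraps zip.AddFileStream(name, "", stream)
        Action save)                         // e.g. wraps zip.Save()
    {
        var openStreams = new List<Stream>();
        try
        {
            foreach (string name in entryNames)
            {
                Stream content = createContent(name);
                openStreams.Add(content);

                // Assumption: the zip library reads from the current position.
                if (content.CanSeek) content.Position = 0;

                addEntry(name, content);
            }

            save(); // the only save, made while every input stream is still open
        }
        finally
        {
            foreach (Stream stream in openStreams) stream.Dispose();
        }
    }

    // Option 2: after the last save to a file, copy that file to the output stream.
    public static void StreamFileTo(string path, Stream output)
    {
        using (FileStream input = File.OpenRead(path))
        {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }
}
```

With the code from the original question, the wiring might look like SaveOnce(GetAllDataSetNames(), Serialize, (name, s) => zip.AddFileStream(name, "", s), () => zip.Save()), where Serialize is a hypothetical helper that returns a MemoryStream holding the SOAP-formatted DataSet.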