Streamed creation of large zip file with low memory cost

Nov 5, 2009 at 2:41 PM

Hi!

I'm developing an application that exports a lot of data from a database server and writes it to a binary file inside a zip file.

I will open a cursor in the database, read one record at a time, and add each row to a file stream inside the zip file.

I can't read all the records at once because that much data will not fit in memory. The simplest way may be to create a file and add it to the zip later using the ZipFile class, but I would like to skip that round trip to the disk.

Is this possible? If so, how?

// Stefan

 

Coordinator
Nov 5, 2009 at 3:10 PM

Yes, there are several ways to do this.

One way is to use the ZipFile class and its AddEntry method, specifying a WriteDelegate.  The WriteDelegate is invoked when you call ZipFile.Save(), and within it you can write whatever you like into the stream that the DotNetZip library hands you. The data you write into that stream is stored as an entry in the zip file.  Example code:

var c1= new System.Data.SqlClient.SqlConnection(connstring1);
var da = new System.Data.SqlClient.SqlDataAdapter()
    {
        SelectCommand=  new System.Data.SqlClient.SqlCommand(strSelect, c1)
    };

DataSet ds1 = new DataSet();
da.Fill(ds1, "Invoices");

using(Ionic.Zip.ZipFile zip = new Ionic.Zip.ZipFile())
{
    zip.AddEntry(zipEntryName, (name,stream) => ds1.WriteXml(stream) );
    zip.Save(zipFileName);
}

If you have a readable stream that supplies the data for the zip entry, you can use the AddEntry overload that accepts a stream.   In this case, when you call ZipFile.Save(), the stream you passed will be read to the end, and the data placed into the zip file.  If you'd like to defer opening the stream until ZipFile.Save is called, there is another overload of AddEntry for that.  In this one, you pass one delegate to open the stream and another to close it. DotNetZip will invoke your delegates on a just-in-time basis.  Example:

using(Ionic.Zip.ZipFile zip = new Ionic.Zip.ZipFile())
{
    zip.AddEntry(zipEntryName,
                 (name) =>  File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite ),
                 (name, stream) =>  stream.Close()
                 );

    zip.Save(zipFileName);
}
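
The sample above covers the open/close delegates. For the simpler overload that takes an already-open readable stream, a minimal sketch could look like the following. The class name, method name and parameters are made up for illustration; only AddEntry(string, Stream) and Save come from DotNetZip.

```csharp
using System.IO;
using Ionic.Zip;

public static class StreamEntryExample
{
    // Hypothetical helper: adds one entry whose content comes from an open stream.
    public static void ZipFromStream(string sourcePath, string entryName, string zipPath)
    {
        using (Stream source = File.OpenRead(sourcePath))
        using (var zip = new ZipFile())
        {
            // The stream is only registered here. It is read to the end when
            // Save() runs, so it has to stay open until Save() returns.
            zip.AddEntry(entryName, source);
            zip.Save(zipPath);
        }
    }
}
```

A call such as StreamEntryExample.ZipFromStream("export.bin", "export.bin", "export.zip") would produce a zip with a single entry. Any readable Stream can stand in for the FileStream.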

The final option, I think, is the ZipOutputStream class. This is a different programming model, one where you have to do a little more work. Here the zip archive is modeled as a stream, and you simply read from one stream (presumably your database) and write into the ZipOutputStream.

private void Zipup()
{
    if (filesToZip.Count == 0)
    {
        System.Console.WriteLine("Nothing to do.");
        return;
    }

    using (var raw = File.Open(_outputFileName, FileMode.Create, FileAccess.ReadWrite ))
    {
        using (var output= new ZipOutputStream(raw))
        {
            output.Password = "VerySecret!";
            output.Encryption = EncryptionAlgorithm.WinZipAes256;

            foreach (string inputFileName in filesToZip)
            {
                System.Console.WriteLine("file: {0}", inputFileName);

                output.PutNextEntry(inputFileName); 
                using (var input = File.Open(inputFileName, FileMode.Open, FileAccess.Read, FileShare.Read | FileShare.Write ))
                {
                    byte[] buffer= new byte[2048];
                    int n;
                    while ((n= input.Read(buffer,0,buffer.Length)) > 0)
                    {
                        output.Write(buffer,0,n);
                    }
                }
            }
        }
    }
}
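
The sample above copies files. For the database case in the question, the same ZipOutputStream pattern might look roughly like this, with one entry per table and rows written as they are read. This is only a sketch: the class, the openReader callback, the ".bin" entry names and the row format (every column written as a length-prefixed string) are placeholders, not anything DotNetZip prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using Ionic.Zip;

public static class TableExport
{
    // openReader is a hypothetical callback that returns an open reader for a table.
    public static void ExportWithZipOutputStream(
        string zipPath, IEnumerable<string> tables, Func<string, IDataReader> openReader)
    {
        using (var raw = File.Create(zipPath))
        using (var output = new ZipOutputStream(raw))
        {
            foreach (string table in tables)
            {
                // Everything written after this call belongs to the new entry.
                output.PutNextEntry(table + ".bin");

                using (IDataReader reader = openReader(table))
                {
                    // Placeholder row format; only one row is held in memory at a time.
                    var writer = new BinaryWriter(output);
                    while (reader.Read())
                    {
                        for (int i = 0; i < reader.FieldCount; i++)
                            writer.Write(Convert.ToString(reader.GetValue(i)));
                    }
                    // Flush but do not dispose: disposing the BinaryWriter
                    // would close the zip stream as well.
                    writer.Flush();
                }
            }
        }
    }
}
```

With SQL Server, openReader could wrap a SqlCommand.ExecuteReader() call. The connection handling is left out here.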

 

Nov 6, 2009 at 8:02 AM

Thank you for a very good, detailed answer about an impressively good product. The last sample seems to fit my requirements exactly.

I need to use a DataReader because of the amount of data: millions of rows. I read each record one at a time into a binary stream and add the stream to the zip stream, with a new entry for each table.

I'm impressed! This cool solution solved my problem.

Time for donations!

// Stefan

 

Coordinator
Nov 6, 2009 at 8:47 AM
Edited Nov 6, 2009 at 8:52 AM

I'm glad it's helpful, Stefan.

Actually, though, from the sound of it you might have simpler code if you went with the first example, the WriteDelegate. 

public void SaveDataIntoZip()
{   
  using(Ionic.Zip.ZipFile zip = new Ionic.Zip.ZipFile())
  {
    foreach ( var table in TableList)
        zip.AddEntry(GetEntryNameForTable(table), MyWriteDelegate);
    zip.Save(zipFileName);
  }
}


private void MyWriteDelegate(string entryName, Stream output)
{
  DataReader reader= CreateAndExecuteReader(entryName);
  while(reader.Read())
  {
    // write into the zip entry
    output.Write(....);  
  }
}

In the first method you have a list of tables that you'd like to save. For each one you add an entry to the zip file. The second method is the delegate. It gets called once for each entry you added - once for each table to be saved. In the MyWriteDelegate code, you open the DataReader (or is it BinaryReader?) for that particular table, then write that data directly into the zip entry.

That write delegate gets invoked N times before the zip.Save() completes. 

For me this kind of code is easier to understand.
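
Filled in a bit more, the WriteDelegate approach might look something like this. Treat it as a sketch under assumptions: the class name, the openReader callback, the ".bin" entry names and the row format (each column as a length-prefixed string) are all invented for the example. Only AddEntry with a write delegate and Save are DotNetZip calls.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using Ionic.Zip;

public static class WriteDelegateExport
{
    // openReader is a hypothetical callback that returns an open reader for a table.
    public static void Export(
        string zipPath, IEnumerable<string> tables, Func<string, IDataReader> openReader)
    {
        using (var zip = new ZipFile())
        {
            foreach (string table in tables)
            {
                // Copy the loop variable so each lambda captures its own table
                // name (needed with compilers older than C# 5).
                string tableName = table;

                zip.AddEntry(tableName + ".bin", (entryName, output) =>
                {
                    using (IDataReader reader = openReader(tableName))
                    {
                        // Placeholder row format; rows are streamed one at a time.
                        var writer = new BinaryWriter(output);
                        while (reader.Read())
                        {
                            for (int i = 0; i < reader.FieldCount; i++)
                                writer.Write(Convert.ToString(reader.GetValue(i)));
                        }
                        // Flush only; the stream belongs to DotNetZip.
                        writer.Flush();
                    }
                });
            }

            // No data has been read yet. The delegates run during Save().
            zip.Save(zipPath);
        }
    }
}
```

Because the reader is opened inside the delegate, each table's query starts only when its entry is being written during Save, and the reader is disposed as soon as that entry is finished.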