Zip merge capability...

Jun 16, 2009 at 2:44 AM

I'm trying to find out if DotNetZip has any zip-merge ability.

The goal here is to start with two zipped files and merge them into a single zip containing both files.

Ideally this would work in a streaming fashion, would allow multiple successive file additions, and at its very best could pre-estimate the final file size.

Is any part of this available? I'm trying to reduce the system impact of on-the-fly zipping by pre-zipping all the files (reducing my storage cost too) and then just merging them into a single common zip.

Worst case I can (sort-of) see how I might do this myself, but that would not be optimal.

For me the optimal would be as per the following pseudo-code...

// pseudo-code only: zipstream, startstreamingout() and add() are not real DotNetZip APIs
using (zipstream out = new zipstream(target))
{
    out.startstreamingout();
    foreach (zippedstream in AllTheFilesIWantToSend)
    {
        out.add(zippedstream);
    }
}

Anyway, hopefully that gives people a clue as to what I have in mind.

Can DotNetZip do this?

Thanks,

 

 

Coordinator
Jun 16, 2009 at 10:08 AM

Nope, there's no merge capability.

A zipfile consists of a series of entries, followed by a directory of all the entries. Each entry holds the file data, compressed and maybe encrypted, along with metadata (data about the data): filename, timestamp, that kind of thing. The entry data can be moved around intact. The directory gives the offset of each entry within the zip file. The DotNetZip library provides an abstraction over the zip file, but doesn't allow you to move a ZipEntry between or among zipfiles. You never get the raw ZipEntry bytes; you get the file bytes, the uncompressed (cooked?) data.
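To make that layout concrete, here is a minimal sketch in Python (not DotNetZip; the standard zipfile module is used only as a convenient way to build and inspect a sample archive). The helper names entry_layout and make_sample are made up for illustration, and the code assumes a plain archive with no zip64 extensions and no archive comment.

```python
import io
import struct
import zipfile

def make_sample():
    """Build a small two-entry archive in memory (illustrative data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", "hello " * 100)
        zf.writestr("b.txt", "world " * 100)
    return buf.getvalue()

def entry_layout(zip_bytes):
    """Return ([(name, entry_offset, compressed_size), ...], directory_offset)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        entries = [(info.filename, info.header_offset, info.compress_size)
                   for info in zf.infolist()]
    # With no archive comment, the end-of-central-directory record is the
    # last 22 bytes; its seventh field is the offset of the directory.
    fields = struct.unpack("<4s4H2LH", zip_bytes[-22:])
    if fields[0] != b"PK\x05\x06":
        raise ValueError("unexpected end-of-central-directory layout")
    return entries, fields[6]

if __name__ == "__main__":
    entries, directory_offset = entry_layout(make_sample())
    for name, offset, size in entries:
        print(f"{name}: entry starts at byte {offset}, {size} compressed bytes")
    print(f"directory starts at byte {directory_offset}")
```

The entries come first, each at the offset the directory records for it, and the directory itself sits after the last entry.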

But what you are describing would not be difficult to build. 

You could use DotNetZip or whatever to create the initial zip files. Then you could write some custom code to parse the zip directory, and based on that, do a simple prune and graft operation on all the raw entries. You can assemble them into a custom zipfile and create a new zip directory based on the new order and offsets. You already have the DotNetZip source, which already parses and creates the central directory.
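As a rough illustration of that prune and graft idea, here is a sketch in Python rather than C#, working directly on the zip format instead of on DotNetZip. The names merge_zips and make_zip are hypothetical. It assumes single-disk archives without zip64 extensions, entries stored back to back before the directory, and no duplicate names across the inputs; it also holds everything in memory rather than streaming.

```python
import io
import struct
import zipfile

CENTRAL_SIG = b"PK\x01\x02"
EOCD_SIG = b"PK\x05\x06"

def _read_eocd(data):
    """Return (entry_count, directory_offset) from the end-of-central-directory record."""
    pos = data.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("not a zip archive")
    fields = struct.unpack("<4s4H2LH", data[pos:pos + 22])
    return fields[4], fields[6]

def merge_zips(archives):
    """Copy every raw entry from each archive (bytes) into one new archive."""
    out = bytearray()
    central = bytearray()
    count = 0
    for data in archives:
        total, cd_offset = _read_eocd(data)
        records = []
        pos = cd_offset
        for _ in range(total):
            fields = struct.unpack("<4s6H3L5H2L", data[pos:pos + 46])
            if fields[0] != CENTRAL_SIG:
                raise ValueError("bad central directory record")
            # Fixed 46-byte part plus name, extra field and comment.
            length = 46 + fields[10] + fields[11] + fields[12]
            records.append((fields[16], data[pos:pos + length]))
            pos += length
        # Each raw entry runs from its own offset to the next entry's offset
        # (or to the directory, for the last one).
        records.sort(key=lambda rec: rec[0])
        ends = [rec[0] for rec in records[1:]] + [cd_offset]
        for (old_offset, record), end in zip(records, ends):
            patched = bytearray(record)
            # Bytes 42..45 of a directory record hold the entry's offset.
            struct.pack_into("<L", patched, 42, len(out))
            out += data[old_offset:end]  # still-compressed bytes, copied as-is
            central += patched
            count += 1
    cd_start = len(out)
    out += central
    out += struct.pack("<4s4H2LH", EOCD_SIG, 0, 0, count, count,
                       len(central), cd_start, 0)
    return bytes(out)

def make_zip(files):
    """Helper for trying it out: build an archive from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, payload in files.items():
            zf.writestr(name, payload)
    return buf.getvalue()

if __name__ == "__main__":
    merged = merge_zips([make_zip({"a.txt": b"alpha " * 50}),
                         make_zip({"b.txt": b"beta " * 50})])
    with zipfile.ZipFile(io.BytesIO(merged)) as zf:
        print(zf.namelist())
```

Nothing is decompressed or recompressed here; the only bytes rewritten are the per-entry offsets in the directory and the final end-of-directory record. Since every piece has a known length up front, the merged size could in principle be computed before writing anything.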

It may be possible to just build a straight extension of DotNetZip, to expose the raw byte stream for each ZipEntry. Rather than reading and decompressing, as happens with ZipEntry.OpenReader(), you could just read the data directly. A new method on ZipFile could then add a raw entry as-is.

Before building it, it might be worthwhile to check your assumptions.  Seems like you have assumptions about how DotNetZip will perform, and you are designing the implementation around those assumptions. Maybe your assumptions are correct, but maybe not. Might be worth testing.

Coordinator
Jun 16, 2009 at 12:59 PM
This discussion has been copied to a work item.