How to create Zip file in memory (without using temp folder)

Oct 15, 2008 at 12:21 AM
Hi,

Can DotNetZip do that?

Thanks
Dominique
Coordinator
Oct 15, 2008 at 2:46 AM
use the ZipFile() constructor that accepts a System.IO.Stream .
Calling Save() on such a ZipFile will write directly to that stream.  No temp file is created in the filesystem.
Dec 29, 2008 at 5:44 AM
Edited Dec 29, 2008 at 1:45 PM
I'm having issues with this as well...

The problem is save can only be called once. I wish that when you specify a memory stream, anything you add to the zip were automatically saved to the zip file, or that you could call save in increments via save() or write() on the ZipEntry or ZipFile...

My scenario is that I have around 2000 really small files that need to be zipped, and they're all being saved at the very end. I have a task that needs to be executed right after the save and implementing an event... would slow down the process and be a pain.

Here is an example of what I would like to do (http://dotnettricks.com/blogs/craigbowesblog/archive/2006/09/02/94.aspx).. with dotnetzip

Another issue that I came across is you can't set the compression level (0-9).

Thanks
-Blake Niemyjski
Coordinator
Dec 29, 2008 at 1:43 PM
There is no compression level setting in this library.  You can choose compression or no compression.

I skimmed the article you mentioned but I don't really know what you mean by pointing me to it - it is an article that describes #ziplib, which is not this library.

There is an example of how to use DotNetZip in an ASP.NET page  - available at http://www.codeplex.com/DotNetZip/Wiki/View.aspx?title=ASPNET%20Example%201&referringTitle=Examples 

I don't know what you mean by "I have a task that needs to be executed right after the save and implementing an event" - Not sure what that has to do with creating a zip file.

Also I don't know what you mean by "save can only be called once."   If you have source code that shows a problem, maybe you could provide it to me.
Also describe what you expect to happen, and what is happening. 

I'll try to help.

There may be a problem in the library, but with what you have given me, I don't know how to help you.
Dec 29, 2008 at 1:55 PM
Edited Dec 29, 2008 at 6:58 PM
"There is no compression level setting in this library.  You can choose compression or no compression."
Yeah, that's what I came across... I zip the files with #ziplib, Windows or WinRAR and the file is 11mb compressed. When I use dotnetzip it's 13mb.

"I skimmed the article you mentioned but I don't really know what you mean by pointing me to it - it is an article that describes #ziplib, which is not this library.

There is an example of how to use DotNetZip in an ASP.NET page  - available at http://www.codeplex.com/DotNetZip/Wiki/View.aspx?title=ASPNET%20Example%201&referringTitle=Examples "
Yes, I looked at all the examples. I'm using this code in msbuild.community.tasks (http://msbuildtasks.tigris.org/). I was trying to update it to use dotnetzip instead of sharpziplib (it's using the code I linked you to). Here is my code.

private static string CleanName( string name )
{
    if ( name == null )
    {
        return string.Empty;
    }

    if ( Path.IsPathRooted( name ) )
    {
        // NOTE: for UNC names...  \\machine\share\zoom\beet.txt gives \zoom\beet.txt
        name = name.Substring( Path.GetPathRoot( name ).Length );
    }

    name = name.Replace( @"\", "/" );

    while ( ( name.Length > 0 ) && ( name[ 0 ] == '/' ) )
    {
        name = name.Remove( 0, 1 );
    }
    return name;
}

private bool ZipFiles()
{
    try
    {
        Log.LogMessage( MSBuild.Community.Tasks.Properties.Resources.ZipCreating, _zipFileName );

        using ( ZipFile zip = new ZipFile( File.Create( _zipFileName ) ) )
        {
            // add files to zip
            foreach ( ITaskItem fileItem in _files )
            {
                string name = fileItem.ItemSpec;
                FileInfo file = new FileInfo( name );
                if ( !file.Exists )
                {
                    // maybe a directory
                    DirectoryInfo directory = new DirectoryInfo( name );
                    if ( directory.Exists )
                    {
                        if ( !_flatten )
                        {
                            bool directoryStartsWith = name.StartsWith( _workingDirectory, true, CultureInfo.InvariantCulture );
                            if ( !string.IsNullOrEmpty( _workingDirectory ) && directoryStartsWith )
                                name = name.Remove( 0, _workingDirectory.Length );

                            if ( !string.IsNullOrEmpty( name ) )
                            {
                                name = CleanName( name );
                                zip.AddDirectory( name + Path.DirectorySeparatorChar );
                            }
                        }
                    }
                    else
                    {
                        Log.LogWarning( MSBuild.Community.Tasks.Properties.Resources.FileNotFound, file.FullName );
                    }

                    continue;
                }

                // clean up name
                if ( _flatten )
                    name = file.Name;
                else if ( !string.IsNullOrEmpty( _workingDirectory )
                          && name.StartsWith( _workingDirectory, true, CultureInfo.InvariantCulture ) )
                    name = name.Remove( 0, _workingDirectory.Length );

                zip.AddItem( fileItem.ItemSpec, CleanName( name ) );

                Log.LogMessage( MSBuild.Community.Tasks.Properties.Resources.ZipAdded, name );
            } // foreach file

            zip.Save();
        }

        Log.LogMessage( MSBuild.Community.Tasks.Properties.Resources.ZipSuccessfully, _zipFileName );
        return true;
    }
    catch ( Exception exc )
    {
        Log.LogErrorFromException( exc );
        return false;
    }
}


"I don't know what you mean by "I have a task that needs to be executed right after the save and implementing an event" - Not sure what that has to do with creating a zip file. "
The problem is that MSBuild is using the zip task to zip x number of files. Immediately after the zip task returns, the next msbuild task runs and tries to copy the zip file. But the zip file is still in use (saving).

"Also I don't know what you mean by "save can only be called once.""
When I put the save() inside of the foreach, the file size grew to around 400mb and the zip contained only the first entry that was added. It has to do with how the save method was implemented.
What I really need is to have the zip file automatically add the new files to the archive, so that when the next msbuild task runs the zip file is up to date and is not still saving a 14mb zip.

Thanks
-Blake Niemyjski
Coordinator
Dec 29, 2008 at 6:29 PM
You raise a bunch of issues.
The file size is 13mb with DotNetZip and 11mb with other compression libraries.  This is a known issue with the effectiveness of System.IO.Compression, on which DotNetZip depends.  There is an open request for me to implement a different compression library.  It's a big piece of work.  I haven't committed to it.  yet?

Not sure how calling Save() in a loop would increase the file size to 400mb.   It does not behave differently if you call it multiple times, or once with multiple files.  Well it shouldn't.  What you describe is very odd. Can you produce a simple test case, without the msbuild task stuff, that demonstrates the behavior you see?   Something where, using the same set of entries, you call Save() once, versus multiple times in a loop, and the files are substantially different sizes. 

You also said the zip file is still in use after Save() completes.  Outside of the using() scope, this should not happen.  The zip file should be closed.  I would like to see a test case of this as well.  Source code that reproduces it.  

Hmm, but your code is wrong.  When you call the ZipFile() constructor, you should pass it the name of a valid zipfile that exists, OR the name of a file to be created.  You should not create an empty file and then pass that filename to the ZipFile() constructor.  [eg,   new ZipFile(File.Create(filename))]   I don't know what this will do;  I would expect it to throw an exception.  Check the docs for example code.  You should never have to call File.Create() on any file you pass to the ZipFile() constructor.  It could be that the File.Create() is perhaps opening the file, and that is what is causing it to remain open.   I suggest you correct your code and then see if the problem still occurs.
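
For reference, the usage described above can be sketched roughly as follows. This is only a minimal sketch, assuming the Ionic.Zip namespace used by v1.7 and later; the class name ZipByName and the method Create are hypothetical names chosen for illustration, not part of the library.

```csharp
using System;
using System.IO;
using Ionic.Zip; // DotNetZip; assumes the v1.7+ namespace

public static class ZipByName
{
    // Hypothetical helper: zips the given files into the archive at zipPath.
    public static void Create(string zipPath, string[] filesToAdd)
    {
        // Pass the archive *name* to the constructor.
        // Do not call File.Create() on it first.
        using (ZipFile zip = new ZipFile(zipPath))
        {
            foreach (string file in filesToAdd)
            {
                zip.AddFile(file);
            }

            zip.Save(); // writes the archive to zipPath
        }
        // The library's handle on the archive is released when the using block exits.
    }
}
```

With this shape, a follow-on step such as copying the archive should only need to wait for Create() to return.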

Last - you said you want the zip file to automatically add new files to the archive, and you don't want to write 14mb to the disk.  I don't know what that means - your code needs to add files to the archive, by calling into the library.  The zip file does not do this automatically.  You need to write code to do it.  And, when your code adds a file to the archive, the zipfile needs to be saved again.  If the file is 14mb, then it means writing 14mb to the disk.  This is the way it works.  You cannot add a file to the archive without writing the zip to the disk.  The two things go together. 
Dec 29, 2008 at 8:13 PM
Hello,

The new ZipFile(File.Create(filename)) call was causing the issue. You may want to look into this, as File.Create returns a FileStream and ZipFile() has an overload that takes a Stream (which accepts a FileStream, since Stream is its base class).

Also, with ZipEntry entry = zip.addxxx..., the entry seems to be lazy loaded, as none of the entry information like file size is filled out.

Thanks
-Blake Niemyjski
Coordinator
Dec 31, 2008 at 10:04 PM
Blake, I will look into the new ZipFile(File.Create(filename)) thing.  I'll add a test for that, and be sure to throw a proper, clear exception.

On the lazy load of information - yes I think that is documented behavior.  The information is valid only after Save().   If it is not documented this way, please advise. (with specifics)
Coordinator
Jan 3, 2009 at 7:25 AM
Edited Jan 3, 2009 at 4:13 PM
On the ZipFile(File.Create(filename)) thing - you're right, Blake, that constructor makes no sense.   It is confusing and useless.

I removed all the constructors that accept Streams as parameters.   Now, if you want to save to a Stream, use the ZipFile.Save(Stream) method.
This change is in v1.7.1.11.
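
To tie this back to the original question, here is a minimal sketch of building a zip entirely in memory with Save(Stream). It assumes a later DotNetZip release in which AddEntry(string, string) is available for adding content from a string (earlier builds may name this method differently); InMemoryZip and Build are hypothetical names used only for illustration.

```csharp
using System.IO;
using Ionic.Zip; // DotNetZip; assumes the v1.7+ namespace

public static class InMemoryZip
{
    // Hypothetical helper: returns the bytes of a zip holding one text entry.
    public static byte[] Build(string entryName, string content)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (ZipFile zip = new ZipFile())
            {
                // Assumption: the AddEntry(string, string) overload exists
                // in the DotNetZip version being used.
                zip.AddEntry(entryName, content);

                // Writes the archive to the stream; no file name is involved.
                zip.Save(ms);
            }

            // ToArray() copies everything written to the MemoryStream.
            return ms.ToArray();
        }
    }
}
```

The returned byte array can then be written to an HTTP response, stored, or passed to another task without touching the filesystem.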
Jan 3, 2009 at 4:49 PM
Hello,

Thank you.

If you have a design/team meeting, I would like to be a part of it. I'm the community lead on the .netTiers project and a few other open source projects. I have a few ideas that would be sweet to have in this library from a framework design standpoint. Personally, I didn't like the change to how the events are implemented now because it adds extra code, but it does clean up things quite a bit.

Thanks
-Blake Niemyjski
Coordinator
Jan 3, 2009 at 8:33 PM
Ah, cool.  Well all the design discussions, such as they are, have been done here on the forums.   

On the eventing - you mean you did not like the extra code in the library ?
I was trying to keep the usability of the library high, even if it meant more code inside.

What do you think the priorities ought to be?  Maybe this is a topic for a new thread.
  new features (which ones)?
  better quality (where)
  more testing?
  different platform support?

I've been trying to balance new features, where there is demand, with quality.

Jan 3, 2009 at 8:51 PM
Hello,

I didn't like how you had one event and once you subscribe to that you have to check the event type. I wish the three main ones (start, completed...) were still there as well and marked as obsolete.

new features (which ones), I know that this builds off of the .net framework's zip support but having better zip compression would be sweet.

better quality (where), I think there should be two solutions, one with and one without CF support. Otherwise you have to unload the project(s). I will get into more specifics when I have time. Since the current version is version 1.7 and there are still major API changes, this scares me. When I upgraded to the latest version for bug fixes (1.6 to 1.7), I ended up doing quite a bit of refactoring. I think that an API review discussion needs to happen and the core needs to be locked down. Please note that I haven't been using dotnetzip for very long; these are just my experiences so far.

more testing? - I see that there are more tests being written which is sweet.

different platform support? - I think what is in 1.7 is great for now. I think that by 2.0 the lib should be solid with not many changes made, just features added onto it after that point.

If you would like to add me to msn, I would be more than happy to help you on your endeavors.

Thanks
-Blake Niemyjski
Coordinator
Jan 3, 2009 at 11:20 PM
Hey Blake - thanks.

To respond...

on Better compression - in v1.7.11, I just factored in a zlib library that does better compression than System.IO.Compression.  It supports compression levels.  With this change, DotNetZip no longer depends on System.IO.Compression.  (ps: there is no zip support, per se, in .NET.  There is DeflateStream, which does compression, but not zip)
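
As a rough sketch of what the compression level setting looks like in use: this assumes the ZipFile.CompressionLevel property and the Ionic.Zlib.CompressionLevel enum as found in later DotNetZip releases, plus the AddEntry(string, string) overload; CompressionLevelDemo and SizeAt are hypothetical names for illustration.

```csharp
using System.IO;
using Ionic.Zip; // DotNetZip; assumes the v1.7+ namespace

public static class CompressionLevelDemo
{
    // Hypothetical helper: returns the archive size in bytes for one text
    // entry saved at the given compression level.
    public static long SizeAt(Ionic.Zlib.CompressionLevel level, string content)
    {
        using (MemoryStream ms = new MemoryStream())
        using (ZipFile zip = new ZipFile())
        {
            // Set the level before adding entries so it applies to them.
            zip.CompressionLevel = level;

            // Assumption: AddEntry(string, string) exists in the version used.
            zip.AddEntry("data.txt", content);
            zip.Save(ms);

            return ms.Length;
        }
    }
}
```

Comparing SizeAt(Ionic.Zlib.CompressionLevel.BestCompression, text) against SizeAt(Ionic.Zlib.CompressionLevel.None, text) on your own data is a quick way to see how much the level matters for a given set of files.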

on CF support - not sure what you mean by "unload the projects".  I have a single solution with a couple projects for CF and a set of projects for the desktop library.  I don't think you need to unload them.   Maybe I am missing something.  I am hesitant to use 2 distinct solutions, because to my mind that would imply 2 distinct code bases, with the likelihood that divergence would occur pretty quickly.  I am trying to keep a single set of source files for both the CF and the regular "PC"  .NET Frameworks.

On the stability of the interface - yes, the interface continues to change, and I know this means hassle for you and other consumers of the library.  I am trying to manage that hassle.  The ctors with streams is probably the biggest change in older "more established" code. The other interface changes, having to do with zip64 and Unicode - are sort of unavoidable if you want new features.

The big change with the events - there you have a very valid criticism.  I knew it would be a problem.  The initial event design was naive - just not flexible enough to allow the "finer grained" eventing that people had asked for.  I found that you'd have to add 7 or 8 event handlers to the ZipFile instance to get full "fine grained" eventing insight.  I felt like the eventing was new enough that I could justify a redesign.  But yes, it does suck for you.  Ahhh, the pleasures of using "open sores" software.

Even now, I am not sure the redesigned eventing model is right. The fine-grained eventing could imply significant performance cost, for those apps that don't want it.  For example, suppose an app just wants to be notified when the Save starts and finishes, and does not care about the progress.  For a large zip file, the app will be notified 10,000 times about events it does not care about.  What does it cost the app to ignore all these events?  Maybe I should introduce a way for the app to specify which events it wants - similar to the way an ISAPI filter works.  But that means added complexity in the interface, just to eliminate a method call which would be pretty fast anyway. Without serious real-world testing, I am just sort of following my instincts here.  (my instinct says, keep it simple and optimize for perf later, if people complain)
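
To make the "start and finish only" case concrete, here is a minimal sketch of an app that subscribes to the single save event and ignores everything except the two event types it cares about. It assumes the SaveProgress event and the ZipProgressEventType values Saving_Started and Saving_Completed as found in later DotNetZip releases, plus the AddEntry(string, string) overload; SaveEventsDemo and CountStartAndFinish are hypothetical names for illustration.

```csharp
using System.IO;
using Ionic.Zip; // DotNetZip; assumes the v1.7+ namespace

public static class SaveEventsDemo
{
    // Hypothetical helper: saves one entry to memory and counts how many
    // start/complete notifications the handler chose to act on.
    public static int CountStartAndFinish(string entryName, string content)
    {
        int seen = 0;

        using (MemoryStream ms = new MemoryStream())
        using (ZipFile zip = new ZipFile())
        {
            zip.SaveProgress += (sender, e) =>
            {
                // One handler receives every save event; filter by type
                // and fall through quickly for the ones we do not want.
                if (e.EventType == ZipProgressEventType.Saving_Started ||
                    e.EventType == ZipProgressEventType.Saving_Completed)
                {
                    seen++;
                }
            };

            // Assumption: AddEntry(string, string) exists in the version used.
            zip.AddEntry(entryName, content);
            zip.Save(ms);
        }

        return seen;
    }
}
```

The cost of ignoring an event here is one delegate call and an enum comparison per notification, which is the overhead being weighed above.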

But I do see your point - if the interface for DotNetZip does not rapidly stabilize, then people like you will rapidly walk away from the library.   It's a challenge because I want to be agile and responsive to the feature requests, and get them into the current release.  But that responsiveness does not leave much annealing time, if you know what I mean. Not much time to vet the design of the new features with users. 
Maybe there is a way for me to flag the new stuff  as "new"  while keeping the older stuff  more stable. Or maybe that is not a worthwhile approach, and it is better to be a little less agile.

On the upgrade from v1.6 to v1.7 - I agree it is looking like v1.7 is really a v2.0, if not in actual number, then certainly in nature.  There is enough change (removal of ctors, namespace change, eventing model change) and new features (zip64, CF, zlib) that it really should be a different major version number just to communicate that out.    v1.7 got much bigger than I thought it would be.   Can you tell me more about the refactoring in your app?  Was it mainly around the eventing?  anything else?