Extract from Stream VS ExtractAll

Dec 26, 2009 at 4:10 PM

I was trying to use streams to extract files from a compressed zip that arrives from a browser as a stream. I got the code to run, but the files pulled out of the zip seemed to be corrupted. I was using the ZipEntry.Extract method and was never able to get it to work. However, using Zip.Extract to extract the same files to a specific directory, and then reading them back to store them in the database, worked just fine. This is the code that did not work. Any ideas? I would prefer to be able to do this using streams.

 

private NameValueCollection ExtractZipAndAddFiles(System.IO.Stream ArchiveStream)
{
    NameValueCollection FileStatus = new NameValueCollection();

    using (ZipFile zip = ZipFile.Read(ArchiveStream))
    {
        foreach (ZipEntry entry in zip.Entries)
        {
            System.IO.MemoryStream FileStream = new System.IO.MemoryStream();

            entry.Extract(FileStream);

            bool Result = AddFile(FileStream, (int)entry.UncompressedSize, entry.FileName, System.IO.Path.GetExtension(entry.FileName));

            FileStream.Close();

            FileStatus.Add(entry.FileName, Result.ToString());
        }
    }

    return FileStatus;
}


Coordinator
Dec 26, 2009 at 5:46 PM

Some comments for you:

  • When extracting zip files, whether into a stream or into a filesystem file, a CRC check happens. If the check fails, DotNetZip throws an exception. If you haven't gotten an exception, then the corruption is happening in your code.
  • A possible cause for the corruption is that you haven't performed a Seek() on the MemoryStream after extracting the file data into it. After the extract, the stream's position sits at the end of the data that was just written. I don't know what your "AddFile" method does, but if it tries to read the FileStream parameter without first resetting the position on the stream, it will not read the uncompressed file data. It will read nothing.
  • It doesn't make sense to extract the entry into a MemoryStream, only to then read from the MemoryStream. Why not just read it directly? You can use ZipEntry.OpenReader() to get the stream directly. It automatically decompresses as you read.
  • You should use a using statement on objects that are IDisposable, like Streams. This eliminates the need to call Close() explicitly, and protects you in case an exception occurs during processing.

 

private NameValueCollection ExtractZipAndAddFiles(System.IO.Stream ArchiveStream)
{
    NameValueCollection FileStatus = new NameValueCollection();

    using (ZipFile zip = ZipFile.Read(ArchiveStream))
    {
        foreach (ZipEntry entry in zip.Entries)
        {
            using (System.IO.Stream s = entry.OpenReader())
            {
                var result = AddFile(s, (int)entry.UncompressedSize, entry.FileName, System.IO.Path.GetExtension(entry.FileName));
                FileStatus.Add(entry.FileName, result.ToString());
            }
        }
    }

    return FileStatus;
}
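
If you would rather keep the MemoryStream, the position problem from the second bullet can be seen with nothing but System.IO. The following is a minimal sketch, not DotNetZip code: the Write() call stands in for entry.Extract(FileStream), and StreamPositionDemo and ReadToEnd are hypothetical names standing in for a consumer such as AddFile.

```csharp
using System;
using System.IO;

public static class StreamPositionDemo
{
    // Hypothetical stand-in for a consumer like AddFile:
    // reads from the stream's current position to the end
    // and returns the number of bytes it saw.
    public static int ReadToEnd(Stream s)
    {
        var buffer = new byte[4096];
        int total = 0;
        int n;
        while ((n = s.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        byte[] payload = { 1, 2, 3, 4, 5 };

        using (var ms = new MemoryStream())
        {
            // Stands in for entry.Extract(ms): writing leaves the
            // position at the end of the data.
            ms.Write(payload, 0, payload.Length);

            // Reading now starts at the end, so no bytes come back.
            Console.WriteLine(ReadToEnd(ms)); // 0

            // Reset the position before handing the stream to a reader.
            ms.Seek(0, SeekOrigin.Begin);
            Console.WriteLine(ReadToEnd(ms)); // 5
        }
    }
}
```

In the original method, the equivalent change would be a Seek(0, SeekOrigin.Begin) on the MemoryStream between the Extract call and the AddFile call.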


Dec 26, 2009 at 6:18 PM

Resetting the pointer on the Stream did the trick.   I figured that it was going to be something simple like that...

Thanks for the suggestions, I do appreciate it.

Chaitanya