Extracting Single File to Stream

May 16, 2011 at 4:32 PM

I'm working on a project where we need to extract a single file from a zip file, and we would prefer not to write the file out to disk.  While searching I found DNZ (DotNetZip), which appears to be a full-featured set of utilities covering everything we need to process zip files.

The code I am using is:

Using zip1 As ZipFile = ZipFile.Read(ZipToUnpack)
    Dim i As ZipEntry
    For Each i In zip1
        If i.FileName.Contains(".NAR") Then
            Dim snar As System.IO.Stream
            i.Extract(snar)

            'Process streamed file
        End If
    Next
End Using

The error I get is "Invalid input. Parameter name: outstream" when i.Extract is called.

In researching the issue, I tried the following example I found, which uses i.OpenReader, but it does not appear to produce the type of stream that I need:

Using zip1 As ZipFile = ZipFile.Read(ZipToUnpack)
    Dim i As ZipEntry

    For Each i In zip1
        If i.FileName.Contains(".NAR") Then
            Using s As Ionic.Zlib.CrcCalculatorStream = i.OpenReader
                Dim n As Integer
                Dim buffer As Byte() = New Byte(4096) {}
                Dim totalBytesRead As Integer = 0
                Do
                    n = s.Read(buffer, 0, buffer.Length)
                    totalBytesRead = (totalBytesRead + n)
                Loop While (n > 0)
                If (s.Crc <> i.Crc) Then
                    Throw New Exception(String.Format("The Zip Entry failed the CRC Check. (0x{0:X8}!=0x{1:X8})", s.Crc, i.Crc))
                End If
                If (totalBytesRead <> i.UncompressedSize) Then
                    Throw New Exception(String.Format("We read an unexpected number of bytes. ({0}!={1})", totalBytesRead, i.UncompressedSize))
                End If

                'Process docNarr = New Document(s)
            End Using
        End If
    Next
End Using

If I could use i.Extract(stream) it should solve all of my issues, but for some reason it won't work.  Also, if I write the file out to disk first and then read it back in, the process works as expected.

Any ideas?

 

Thanks!

 

Coordinator
May 19, 2011 at 6:44 PM

You have defined snar as a Stream, but it's not assigned a value in your code, as far as I can tell.

You could extract into a MemoryStream, a FileStream, or some other type of writable stream.  To use a MemoryStream you'd do something like this:

Using zip1 As ZipFile = ZipFile.Read(ZipToUnpack)
    Dim e As ZipEntry
    For Each e In zip1
        If e.FileName.Contains(".NAR") Then
            Using snar As New System.IO.MemoryStream()
                e.Extract(snar)
                snar.Seek(0, System.IO.SeekOrigin.Begin)
                'Process streamed file here
            End Using
        End If
    Next
End Using

But I'm not sure what you want to do about "processing" the snar. You said you wanted to extract it but didn't say where you wanted the data to be placed.

I guess that part is up to you.

Good luck.

 

May 19, 2011 at 7:22 PM

In my code, snar is defined as a System.IO.Stream which is what the extract method expects.

When it comes to the processing, I am using a utility from ASPOSE which allows me to read the file (an RTF) and then grab text out of the file.  It will accept a stream but I am having problems getting System.IO.Stream back from the Extract method.

Any ideas?

Coordinator
May 19, 2011 at 9:20 PM

I'm not clear on ... System.IO.Stream is an abstract base class. You cannot ever construct one. The way it works: you instantiate a derived type - like a MemoryStream, FileStream, NetworkStream or something else.  The System.IO.Stream is just an abstraction, a way to deal with all of those things via the same methods (Read, Seek, Write, etc). http://msdn.microsoft.com/en-us/library/system.io.stream.aspx

In your code, you declare snar as a Stream.  But it isn't initialized.  Not in the code I saw.  It needs to hold reference to an instance of a concrete class.  You've only declared it.

I don't know what the ASPOSE thing has to do with it.  I'm not clear how you want that to work with DotNetZip.

If you want a readable stream for a ZipEntry - in other words, you want to read an entry out of a zip file (and decompress as you read) - then call ZipEntry.OpenReader() in lieu of ZipEntry.Extract().   OpenReader gives you a readable stream. Extract() writes content into a writable stream, that you must have previously opened.  

If you're not clear on that - think of a stream as a conduit, a pipe that carries liquid.  Some pipes can emit output - you can draw liquid from them.  Other pipes cannot emit output, but instead can carry liquid away; in other words, they can accept input.  The former are like readable streams, the latter like writable streams.  Which kind of stream do you want?  If you want to suck output out of the zipfile, then call ZipEntry.OpenReader().  Then invoke stream.Read() on what you get back (see the doc for full examples).  If you already have a writable stream - something that accepts input, like a MemoryStream or a FileStream opened for output - then call ZipEntry.Extract(), passing your writable stream.
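
To make the two directions concrete, here is a minimal C# sketch of both calls side by side. The class and method names (ZipEntryStreams, ExtractToMemory, ReadAsText) are made up for illustration and are not part of DotNetZip; only ZipEntry.Extract(Stream) and ZipEntry.OpenReader() come from the library. It assumes the entry belongs to a ZipFile that is still open while these methods run.

```csharp
using System.IO;
using Ionic.Zip; // DotNetZip

// Hypothetical helper class, for illustration only.
public static class ZipEntryStreams
{
    // "Push" direction: Extract() writes the decompressed bytes into a
    // writable stream that the caller has already created.
    public static MemoryStream ExtractToMemory(ZipEntry entry)
    {
        var buffer = new MemoryStream();
        entry.Extract(buffer);
        buffer.Seek(0, SeekOrigin.Begin); // rewind so a consumer reads from the start
        return buffer;
    }

    // "Pull" direction: OpenReader() hands back a readable stream that
    // decompresses the entry as the caller reads from it.
    public static string ReadAsText(ZipEntry entry)
    {
        using (var reader = new StreamReader(entry.OpenReader()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Either result can be handed to a consumer that accepts a readable System.IO.Stream. The MemoryStream version holds the whole entry in memory, while the OpenReader version reads it incrementally.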

 

May 31, 2011 at 6:12 PM
Edited May 31, 2011 at 6:15 PM

Thank you, zipentry.OpenReader() solved my problem!  I was trying to extract a compressed XML file without using a memory stream or temp file.

This is what worked for me:

 

using (var zipfile = ZipFile.Read(fileName)) {
   foreach (var zipentry in zipfile) {
      // OpenReader() streams the entry's uncompressed contents
      using (var reader = zipentry.OpenReader()) {
         var xmlDoc = XElement.Load(reader);
         ...
      }
   }
}

 

I guess what was not clear to me during my attempts (where I tried InputStream) is that OpenReader would stream the uncompressed contents.

Jan 20, 2012 at 10:53 PM
Edited Jan 20, 2012 at 10:54 PM

I don't get what you mean by "OpenReader gives you a readable stream."

I'm trying to extract a zip file that contains a single txt file directly to the output stream to make it downloadable. I was able to achieve it with this code:

using (ZipFile zip = ZipFile.Read(path))
{
    var OutputStream = new MemoryStream();
    zip[0].Extract(OutputStream);
    OutputStream.Seek(0, SeekOrigin.Begin);

    Response.AddHeader("content-disposition", "attachment; filename=" + filename);
    return File(OutputStream, "text/plain");
}

Nevertheless I thought I could do it (in a better way) like this:

using (ZipFile zip = ZipFile.Read(path))
{
    Response.AddHeader("content-disposition", "attachment; filename=" + filename);
    return File(zip[0].OpenReader(), "text/plain");
}

But this code raises the error "The stream is not readable." Shouldn't it be exactly the opposite?
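
One possible explanation, not confirmed anywhere in this thread: in ASP.NET MVC the result returned by File(stream, ...) reads the stream later, after the action method has returned. By then the using block has already disposed the ZipFile, and the stream from OpenReader() depends on it. The MemoryStream version works because its data was copied out before the ZipFile was disposed. The sketch below shows only the general effect with plain .NET types; the class and method names are made up for illustration, and a MemoryStream stands in for the zip-backed stream.

```csharp
using System;
using System.IO;

// Hypothetical demo class, for illustration only.
public static class DeferredReadDemo
{
    // Returns a stream from inside a using block. Dispose() runs as the
    // method returns, so the caller receives an already-closed stream.
    public static Stream OpenThenDispose()
    {
        using (var owner = new MemoryStream(new byte[] { 1, 2, 3 }))
        {
            return owner;
        }
    }

    public static void Main()
    {
        Stream s = OpenThenDispose();
        Console.WriteLine(s.CanRead); // prints "False": the stream was disposed
    }
}
```

If that is what is happening here, the entry's stream has to stay valid until the response has been written. Copying into a MemoryStream first, as in the working snippet, is one way to do that.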