Stream zip file to WCF Service and unzip on server?

Aug 22, 2011 at 2:07 PM
Edited Aug 22, 2011 at 3:38 PM

Hi

I've had a look at some of the discussions on this library to do with streaming zip files, but I'm not having much luck with my scenario.

What I'm trying to do is this: zip a directory containing .pdf/.txt files and save the zip to a stream.  I then want to send this stream to my WCF service and unzip it to see the contents at the other end.  I'm also trying to do this without saving anything to the file system, if that is possible.

My client side code at the moment looks like this:

using (MemoryStream ms = new MemoryStream())
{
     using (ZipFile zip = new ZipFile())
     {
          zip.Password = "Password";
          zip.AddDirectory("C:\\Test");
          zip.Save(ms);
     }
     
     ms.Position = 0;
     service.UploadFile(ms);
}

As you can see, I'm currently saving to a memory stream on the client and then sending this to the service.  I'm receiving the stream on the service end, but I'm not sure how to take the stream and turn it into a zip I can read in memory.  I'd really like to avoid copying anything to the file system on the server too, if that's possible.

Really appreciate any help or advice you can give.

Thanks

Coordinator
Aug 22, 2011 at 6:52 PM

First, I'd suggest you consider not zipping to a MemoryStream.  WCF streaming allows the client to write directly to the WCF stream, which means you could save to *that*.  Every byte written gets sent out through the WCF channel stack directly. There's no need to accumulate all the bytes of the zip file in one place, as you are doing with a MemoryStream.

Second, on the service side, I suppose you want to read the stream. 

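public void UploadZipFile(Stream stream)
{
    // The incoming WCF stream is forward-only, and ZipFile.Read() wants a
    // seekable stream, so copy everything into a MemoryStream first.
    int n;
    var buffer = new byte[1024];
    var ms = new MemoryStream();
    while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        ms.Write(buffer, 0, n);
    }
    ms.Seek(0, SeekOrigin.Begin);   // rewind before reading the archive
    using (var zip = ZipFile.Read(ms))
    {
        // ... inspect or extract the entries here
    }
}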
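
To fill in the "// ..." part without touching the file system, each entry can be extracted into its own MemoryStream. This is only a sketch under assumptions: InMemoryZipReader and ReadEntries are hypothetical names, Stream.CopyTo needs .NET 4 or later, and the password is assumed to be the one the client set before saving.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Ionic.Zip;

// Hypothetical helper, not part of DotNetZip.
static class InMemoryZipReader
{
    // Returns entry name -> uncompressed bytes, all held in memory.
    public static Dictionary<string, byte[]> ReadEntries(Stream zipStream, string password)
    {
        var result = new Dictionary<string, byte[]>();
        using (var ms = new MemoryStream())
        {
            zipStream.CopyTo(ms);            // buffer the forward-only stream (.NET 4+)
            ms.Position = 0;
            using (ZipFile zip = ZipFile.Read(ms))
            {
                foreach (ZipEntry entry in zip)
                {
                    if (entry.IsDirectory) continue;   // nothing to extract for folders
                    using (var content = new MemoryStream())
                    {
                        // Assumption: every entry was saved with this password.
                        entry.ExtractWithPassword(content, password);
                        result[entry.FileName] = content.ToArray();
                    }
                }
            }
        }
        return result;
    }
}
```

Inside UploadZipFile() this would be called as InMemoryZipReader.ReadEntries(stream, "Password"). It holds the whole archive plus every extracted file in memory at once, so it only suits modest uploads.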

Aug 23, 2011 at 9:32 AM

Thanks for the quick reply Cheeso.  The server side code works great thanks!

When you recommend not zipping to the MemoryStream, are you able to give a sample of how else I'd zip the directory to a writable stream client side, please?  I'm not actually streaming the zip file to the WCF service, as I am using wsHttpBinding, so the zip file is sent using the buffered transfer mode.

On a side note, I really like this zip library, great work.

Thanks

 

Coordinator
Aug 24, 2011 at 3:53 PM
Edited Aug 24, 2011 at 3:56 PM

I have to admit that I don't have a working WCF install right now, so I cannot test this out. I messed with it for a while yesterday until I threw my hands up in frustration.  How hard is it to just get WCF to work?  I get 404.17 errors, and apparently there are eleventy-seven causes of that problem when using WCF on Windows 7, and none of the mitigations I found for those issues were effective on my machine.  So I cannot show you actual working example code. But I'm confident that the code I'm about to show you is the right idea.

On your client side, I think that you are calling UploadFile() and passing a readable stream.  A MemoryStream works nicely for this purpose.  What I suggested to you is to write directly to the stream, and as you've noticed there's a directional mismatch here.  DotNetZip can write to a writable stream, but on the client side you need to hand WCF a readable stream (you pass it to the client-side proxy method).  The MemoryStream solves that, but at the cost of needing to accumulate all the data in one buffer, at one time.  This is simple, but it is memory inefficient for large streamed transfers, and why use streaming unless the data is large, right?

But there's a better solution that is memory efficient.  There's a nifty bit of boilerplate logic possible in .NET 3.5 and above that allows you to effectively flip the direction of a stream.  The important bit is this utility method:

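// Needs: using System; using System.IO; using System.IO.Pipes; using System.Threading;
static Stream GetPipedStream(Action<Stream> writeAction)
{
    // Server end: writable (PipeDirection.Out is the default).
    AnonymousPipeServerStream pipeServer = new AnonymousPipeServerStream();
    // Claim the readable end before the writer starts, so the writer can
    // never dispose the pipe before the client handle has been handed out.
    Stream pipeClient = new AnonymousPipeClientStream(pipeServer.GetClientHandleAsString());
    ThreadPool.QueueUserWorkItem(s =>
    {
        using (pipeServer)
        {
            writeAction(pipeServer);        // runs on a thread-pool thread
            pipeServer.WaitForPipeDrain();  // wait until the reader has consumed everything
        }
    });
    return pipeClient;
}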

You need that on the client side. To use it, in your client logic, you will need something like this:

public void SendTheZip()
{
    // nFiles and filelist are placeholders for your own list of files to zip.
    UploadFile(GetPipedStream(s =>
        {
            int n;
            byte[] bytes = new byte[1024]; // any size will do
            // true = leave the pipe open; GetPipedStream() disposes it.
            using (ZipOutputStream zos = new ZipOutputStream(s, true))
            {
                for (int i = 0; i < nFiles; i++)
                {
                    zos.PutNextEntry("File" + i); // the name used for the entry in the zip archive
                    using (FileStream fs = File.OpenRead(filelist[i])) // filelist[i]: a filename
                    {
                        while ((n = fs.Read(bytes, 0, bytes.Length)) > 0)
                        {
                            zos.Write(bytes, 0, n);
                        }
                    }
                }
            }
        }));
}

 

What does that do?  Let's break it down.  The UploadFile() is the call to the proxy method for the WCF service. That's the basic thing you want to do.  It accepts a stream, which must be readable.  The GetPipedStream() method returns a stream, which is readable.  With me so far?

The GetPipedStream() method takes an Action<Stream>, which is just a typed delegate, a method accepting a stream.  In this case we're giving GetPipedStream() an anonymous method, but it need not be so. You could use a named method here. Doesn't matter.  Ok, when GetPipedStream() is called with that Action, what does it do?  It returns an AnonymousPipeClientStream, which is a readable stream.  When someone calls Read() on that stream, that someone reads from the pipe.  What information is it reading?  It reads what is being written to the server side of the pipe.  The converse side of the anonymous pipe, the thing that is "filling" the pipe, in other words writing to the pipe, is the Action - which in this case is the anonymous method you passed.  This anonymous method is the thing that reads a bunch of files and writes them to a ZipOutputStream, effectively zipping those files into an archive.  Still with me?  The output of that ZipOutputStream goes to the server side of the pipe.  Yes?  Are you seeing it?

See, ZipOutputStream is a "decorator".  What I mean is, it's a stream that wraps another stream.  Write into it, and it does some stuff to the data you write in (compresses it), and then writes to the stream it wraps.  Instantiate ZipOutputStream with an output stream, and the bytes of a zipped file go to that output stream.  The anonymous method writes to a ZipOutputStream, which is wrapped around.....the AnonymousPipeServerStream that is created in GetPipedStream().   Yes?

And according to the logic in GetPipedStream(), the anon method writes into that pipe (stream) on a background thread.  So, the upshot is, when you call GetPipedStream(), it creates a pipe, queues up a bg thread to invoke your Action method, and passes the Action the writable pipe.  The Action writes into the pipe.  But the way pipes work, once the pipe's small buffer is full, no further write can succeed unless and until the client side reads.  So GetPipedStream() then instantiates the client side of that pipe (the readable side) and returns it to your code.  Your code calls UploadFile(), which gets that readable stream; UploadFile() reads data from the read side of the anonymous pipe. As the WCF runtime reads that stream, it allows your Action, running on the background thread, to write into the stream.  Do you see?  The BG (writing) thread cannot proceed until someone is reading (excepting the effect of some small degree of buffering).  So as WCF reads the AnonymousPipeClientStream on the main thread, it allows your Action method to proceed on the background thread, sending the bytes of a valid zip file through the WCF stream.
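
The writer/reader hand-off can be tried without WCF or DotNetZip at all. The following is a minimal sketch, assuming Windows and the .NET Framework as in this thread; PipeDemo and CountBytes are hypothetical names, and CountBytes merely stands in for whatever reads the stream (the WCF proxy, in the real case).

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading;

static class PipeDemo
{
    // Flips a "write into a stream" action into a readable stream.
    public static Stream GetPipedStream(Action<Stream> writeAction)
    {
        var pipeServer = new AnonymousPipeServerStream();   // writable end
        var pipeClient = new AnonymousPipeClientStream(pipeServer.GetClientHandleAsString());
        ThreadPool.QueueUserWorkItem(_ =>
        {
            using (pipeServer)
            {
                writeAction(pipeServer);        // blocks whenever the pipe buffer is full
                pipeServer.WaitForPipeDrain();  // wait for the reader to catch up
            }
        });
        return pipeClient;                       // readable end
    }

    // Stand-in for the consumer: drains the stream and counts the bytes.
    public static long CountBytes(Stream readable)
    {
        long total = 0;
        int n;
        var buffer = new byte[8192];
        using (readable)
        {
            while ((n = readable.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += n;
            }
        }
        return total;
    }

    static void Main()
    {
        // The writer produces 256 chunks of 4 KB on a background thread,
        // while the main thread reads them as they arrive.
        long total = CountBytes(GetPipedStream(s =>
        {
            var chunk = new byte[4096];
            for (int i = 0; i < 256; i++)
            {
                s.Write(chunk, 0, chunk.Length);
            }
        }));
        Console.WriteLine(total);
    }
}
```

At no point does the full megabyte sit in one buffer; the writer is held back by the pipe until the reader consumes data, which is the memory saving the pipe buys over a MemoryStream.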

Does this all make sense?

Try it, it should work for you.

Keep in mind - this complication is appropriate and necessary only when the data you are sending is large.  If you're sending an 8k zip file, then it's probably better to just use a MemoryStream.  The pipe approach will still work for small files, but it's more code to maintain, and it's unnecessary. As you get to larger files, that is when the technique I described here, using a pipe and a background thread, will deliver benefits in terms of memory efficiency.
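
One caveat: the memory benefit only materializes if the binding really streams. As noted earlier in the thread, wsHttpBinding uses buffered transfer, and it has no streamed mode. Below is a minimal sketch of a streamed binding, assuming basicHttpBinding is acceptable for the service; StreamingBinding is a hypothetical helper, and the size and timeout values are arbitrary placeholders.

```csharp
using System;
using System.ServiceModel;

// Hypothetical helper; the limits below are placeholders, not recommendations.
static class StreamingBinding
{
    public static BasicHttpBinding Create()
    {
        return new BasicHttpBinding
        {
            // Send the message body as a stream instead of buffering it whole.
            TransferMode = TransferMode.Streamed,
            // The default limit is small; raise it to fit the largest expected zip.
            MaxReceivedMessageSize = 256L * 1024 * 1024,
            // Large uploads can outlast the default send timeout.
            SendTimeout = TimeSpan.FromMinutes(10)
        };
    }
}
```

The same settings have to be applied on both the client and the service endpoint. Switching away from wsHttpBinding also gives up its message-level security features, so whether this trade is acceptable depends on the service.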

FYI: You can find a writeup describing how to use this technique on the server side here.  It allows you to create a zip on the server side, and transmit it through WCF streams to a requesting client, again, without accumulating all the data to be transferred all in one buffer at one time.  You don't need this server-side code for the problem you're trying to solve here,  but someone in the future might.

 

Coordinator
Aug 24, 2011 at 4:06 PM
Edited Aug 24, 2011 at 4:08 PM

If the anonymous method part is not helpful, you might consider this code. It's equivalent but it uses a named method for the Action that is passed to GetPipedStream.

private void ZipSaver(Stream s)
{
    int n;
    byte[] bytes = new byte[1024]; // any size will do
    using (ZipOutputStream zos = new ZipOutputStream(s, true))
    {
        for (int i = 0; i < nFiles; i++)
        {
            zos.PutNextEntry(Path.GetFileName(filelist[i]));
            using (FileStream fs = File.OpenRead(filelist[i]))
            {
                while((n = fs.Read(bytes,0,bytes.Length))>0)
                {
                    zos.Write(bytes, 0, n);
                }
            }
        }
    }
}


public void SendTheZip2()
{
    UploadFile(GetPipedStream(ZipSaver));
}
        

 

Keep in mind that the Action passed to GetPipedStream() (in this case, the Action is ZipSaver) is invoked on the background thread. So if your client program is a WinForms or WPF app, you may have UI thread issues to contend with. You cannot update the UI controls from the ZipSaver() method without doing a cross-thread invocation.
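
One possible way to do that cross-thread invocation is to capture the UI thread's SynchronizationContext and post updates back to it. This is a sketch only: UiProgressReporter is a hypothetical class, and the callback you pass in is whatever updates your own controls.

```csharp
using System;
using System.Threading;

// Hypothetical helper: create it on the UI thread, call Report() from any thread.
sealed class UiProgressReporter
{
    private readonly SynchronizationContext _context;
    private readonly Action<string> _update;

    public UiProgressReporter(Action<string> update)
    {
        // On a WinForms/WPF UI thread this captures the UI context. If there is
        // none (e.g. a console app), fall back to the default context, which
        // runs the callback on a thread-pool thread.
        _context = SynchronizationContext.Current ?? new SynchronizationContext();
        _update = update;
    }

    public void Report(string message)
    {
        // Post() queues the callback on the captured context without blocking
        // the caller, so the zipping thread keeps going.
        _context.Post(_ => _update(message), null);
    }
}
```

In a form, this might be constructed as new UiProgressReporter(text => statusLabel.Text = text) before calling SendTheZip2(), where statusLabel is an assumed control name, and ZipSaver() would then call Report() after each PutNextEntry().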

Aug 26, 2011 at 8:02 AM

Thanks a lot Cheeso, you're a legend!  Your reply was just what I was looking for and very helpful. 

I couldn't agree more, it took me a good while to get a WCF service actually working.  I'm pretty sure I understand everything in your replies and I'm sure it will come clearer as I step through the code.

Thanks again!