How to read/extract single file from ZIP

Mar 27, 2010 at 6:49 AM
Edited Mar 27, 2010 at 10:07 AM

Dear all,

Although I have not explored DNZ in detail yet, I couldn't find a simple code example for extracting a single file by name from a zip file. Can you please let me know how to do it?

What I want to do is this:

I have a sample.zip file that contains multiple files. At any given time I want to extract a single file, and I don't know in advance which one will be required. Say it contains file1.txt and file2.txt, and I want to read file2.txt. Is there any function like:

zipfile.read("ZipFileName.zip")

zipfile.extract("file1.txt") ?

or

zipfile.read("file1.txt") as io.stream etc ?

Please let me know how to do this

VBNewComer

Problem solved this way :

Using iozip As New Ionic.Zip.ZipFile("MyZip.zip")
    MsgBox(iozip.Count)
    For Each entry As Ionic.Zip.ZipEntry In iozip
        ' Match on the entry name rather than on the descriptive Info text
        If entry.FileName.Contains("printed.prn") Then
            MsgBox("File Found : " & entry.FileName)

            ' Extract the file into the current directory, replacing any existing copy
            entry.Extract(FileIO.FileSystem.CurrentDirectory, Ionic.Zip.ExtractExistingFileAction.OverwriteSilently)
            MsgBox("File : " & entry.FileName & " Extracted")
        End If
    Next
End Using

 

Coordinator
Mar 27, 2010 at 3:37 PM

I'm glad you found the solution.

There are plenty of examples in the .chm documentation and at http://dotnetzip.codeplex.com/documentation .
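For the original question, a shorter route than looping over every entry is to look the entry up by name. The following is only a sketch, assuming DotNetZip v1.9 (the Ionic.Zip namespace); the module and function names are made up for illustration. The string indexer returns Nothing when no entry has that name, so both helpers check for that.

```vbnet
Imports System
Imports System.IO
Imports Ionic.Zip

Module SingleEntryDemo
    ' Returns the text of one entry, or Nothing if the entry is absent.
    Function ReadEntryText(zipPath As String, entryName As String) As String
        Using zip As ZipFile = ZipFile.Read(zipPath)
            Dim entry As ZipEntry = zip(entryName)   ' lookup by name
            If entry Is Nothing Then Return Nothing
            ' OpenReader gives a read-only stream over the decompressed data
            Using reader As New StreamReader(entry.OpenReader())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    ' Extracts one entry into targetDir, overwriting an existing file.
    ' Returns False if the archive has no entry with that name.
    Function ExtractEntry(zipPath As String, entryName As String, targetDir As String) As Boolean
        Using zip As ZipFile = ZipFile.Read(zipPath)
            Dim entry As ZipEntry = zip(entryName)
            If entry Is Nothing Then Return False
            entry.Extract(targetDir, ExtractExistingFileAction.OverwriteSilently)
            Return True
        End Using
    End Function
End Module
```

Called as ReadEntryText("sample.zip", "file2.txt"), this covers the "read as a stream" case from the question; ExtractEntry covers the "extract to disk" case.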

Mar 27, 2010 at 4:16 PM
Edited Mar 27, 2010 at 4:18 PM

Dear Cheeso,

Thanks for the quick response, and many thanks for such a wonderful library.

Please consider adding such an option. I'm still looking for one more solution: I want to read a zip file directly from FTP and extract only one file's contents. I don't want to download the whole zip file; rather, I want to search for one file and download only that file.

VBNewComer

Coordinator
Mar 28, 2010 at 10:25 PM

I don't know if that would be possible from FTP.

But such a feature really does not belong in a ZIP library.  The library reads from a System.IO.Stream, so if you can model the file as a stream, then DotNetZip can read from it. 

Some time ago, another user suggested I add a feature that allows DotNetZip to read from an HTTP endpoint.  This was a similar request, and I replied similarly.  That feature doesn't belong in a zip library. So this person built an HTTPRangeStream, which can read (with seek operations) over HTTP.  This allows DotNetZip, or any application, to read a document over HTTP, as a stream. 

See http://cheeso.members.winisp.net/srcview.aspx?file=HttpRangeStream.cs 

This allows code like so: 

HTTPRangeStream s = new HTTPRangeStream("http://Myserver.com/alpha/beta/gamma.zip");
using ( ZipFile z = ZipFile.Read(s))
{
    bool wantOverwrite= true;
    z["MyEntryName.txt"].Extract("unpack", wantOverwrite);
}

If you can produce or find a similar stream for FTP, then you would have the capability you seek.  I'm not sure it's possible, though. I don't think FTP supports the RANGE option as HTTP does.  On the other hand, maybe you are happy with the HTTPRangeStream.

 

Mar 29, 2010 at 2:11 PM

Yes, I'm a little weak in VB too, so I'm not good at converting C# code either. Thanks for your good suggestion.

I think it would be a good idea to develop another project that depends on DNZ and extends it with the other functionality that users ask for, like this one. That would create a separate library that is useful to others in the same discipline, saves all users time, and gives a more robust, reusable result.

That way, when users make suggestions, they become features of that add-on library, and each suggestion and reply permanently becomes part of it, which will strengthen both this project and the new one.

Please take these as humble suggestions only.

 

Coordinator
Mar 29, 2010 at 3:15 PM

In VB.NET, the code I gave above would look like the following:

Dim s As New HTTPRangeStream("http://Myserver.com/alpha/beta/gamma.zip")
Using zip As ZipFile = ZipFile.Read(s)
    Dim wantOverwrite as Boolean = True
    zip("MyEntryName.txt").Extract("unpack", wantOverwrite)
End Using

I appreciate the suggestion about a different project that would provide additional function, as desired by users.

I'm not sure if you're familiar with the principles of OO Design, but one of the most foundational ones is the Principle of Single Responsibility. It means that each class should have one responsibility. 

Now, DotNetZip knows how to manage zip files.   It depends on other classes for other things - like IO.  If you want to read a zip file from an FTP server, that is really just the combination of two things - reading a file from an FTP server, and reading a ZIP file.  According to this design philosophy, which has developed over the years as the preferred way to design classes, the ZIP manager (like DotNetZip) really should have no knowledge of FTP protocols, nor HTTP protocols, nor any other network protocols.  The ZIP manager needs only to know how to manage zip files.  It knows how to read a zip file from a stream.  That stream might be a network stream, or an HTTP stream (via a class I showed you, above), or some other stream.   But the IO function is de-coupled from the zip function, according to the Principle of Single Responsibility.

Therefore I suggest that if you want to read a zip file from an FTP site, you should first find or build something that abstracts the FTP content as a stream. The zip manager (DotNetZip) should not know anything about FTP.

There may be such a project.  I don't know, you should search for one.  If the FtpStream is correctly implemented, then it will be easy, very very simple, to use it with DotNetZip.  In fact it would be very easy to use such an FtpStream class with ANY project that can utilize a System.IO.Stream.  This is the goal of the use of abstractions, such as System.IO.Stream - to enable the independent construction of distinct pieces of software that work together, though they were designed separately. 
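To make the stream abstraction concrete, here is a small sketch. It assumes DotNetZip v1.9 (for AddEntry with string content), and the module and function names are hypothetical. A MemoryStream stands in for the remote file; the function that reads the archive only ever sees a System.IO.Stream, so an HTTP-backed or FTP-backed stream could be passed in the same way, provided it supports Read and Seek.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Ionic.Zip

Module StreamSourceDemo
    ' Works with any readable, seekable Stream; it knows nothing about
    ' where the bytes come from (disk, memory, HTTP, FTP, ...).
    Function ListEntryNames(source As Stream) As List(Of String)
        Dim names As New List(Of String)
        Using zip As ZipFile = ZipFile.Read(source)
            For Each entry As ZipEntry In zip
                names.Add(entry.FileName)
            Next
        End Using
        Return names
    End Function

    Sub Main()
        ' Build a small zip entirely in memory to stand in for a remote file.
        Using buffer As New MemoryStream()
            Using zip As New ZipFile()
                zip.AddEntry("file1.txt", "first")
                zip.AddEntry("file2.txt", "second")
                zip.Save(buffer)
            End Using
            buffer.Position = 0   ' rewind before handing the stream to the reader
            For Each name As String In ListEntryNames(buffer)
                Console.WriteLine(name)
            Next
        End Using
    End Sub
End Module
```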

Last thing - maybe you are not clear on this.  A class such as HttpRangeStream, though it is written in C#, can be easily used in an application that is written in VB.NET.  You need not convert the C# library code in order to use it in VB.NET.   Maybe you are already clear on this.  HttpRangeStream is no different than DotNetZip in that respect - each can be used from any .NET language.

----

So in summary:

  1. Consider whether the remote ZIP file can be read via HTTP rather than FTP.  If so, you have HTTPRangeStream, and can use that immediately.
  2. If HTTP is not acceptable, look for or develop a reusable FtpStream class, which abstracts an FTP resource as a stream.

 

 

Mar 29, 2010 at 4:47 PM

Thanks Cheeso, I agree with your views, and I'm quite sure that this is not the responsibility of the zip library, nor should it be. That is the only reason I suggested creating separate projects that depend on DNZ; that way, other functions would be added by users as they need them.

Thanks again for great help.

Coordinator
Mar 29, 2010 at 9:59 PM
Edited Mar 30, 2010 at 7:49 PM

See http://cheeso.members.winisp.net/srcview.aspx?dir=streams&file=FtpReadStream.cs

This class is a read-only stream, layered on top of the FTP protocol.

To use this class:

Using s As FtpReadStream = New FtpReadStream("Myserver.com", "/alpha/beta/gamma.zip")
    s.User = "UserName"  ' optional
    s.Password = "Secret!"  ' optional
    Using zip As ZipFile = ZipFile.Read(s)
        Dim wantOverwrite as Boolean = True
        zip("MyEntryName.txt").Extract("unpack", wantOverwrite)
    End Using
End Using

The FtpReadStream supports Read() and Seek() operations on the FTP-accessible file.  But, a Seek() followed by a Read() will be slow, because the class sets up a new socket connection every time that happens, and that takes 1 or 2 seconds.  When reading a zip file, that may happen 6 or 7 times, therefore the performance of the FtpReadStream will be very low, in comparison to reading a local file with FileStream.  You can mitigate this somewhat by wrapping the FtpReadStream in a BufferedStream. 

Even so, there is a benefit for very large zip files.  The effect of using the FtpReadStream as a source for DotNetZip is this:  The ZipFile.Read() method seeks to *near* the end of the file, to get the zip directory.  It downloads only the portion of the zip file that contains the directory, usually less than 80 bytes for each file in the zip.   If you then want to extract a file (or entry), DotNetZip will Seek() in the stream, and the FtpReadStream will download only  the bytes necessary for that particular entry.  

Using this stream, instead of just downloading the entire file via FTP and reading it locally, makes sense when the time-cost of seeking in the ftp stream is smaller than the time-cost to download and discard the unwanted bytes in the zip file.  For large zip files it makes sense. For smaller zip files it will be slower.
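The trade-off can be put into rough numbers. The sketch below is a back-of-the-envelope estimate only: the function names are hypothetical, and the figures in the note after it (7 seeks at 2 seconds each, 1,000,000 bytes per second of bandwidth) are illustrative assumptions, not measurements.

```vbnet
Imports System

Module FtpCostEstimate
    ' Estimated seconds to fetch one entry through a seekable remote stream:
    ' each seek pays a reconnect penalty, then only the needed bytes are transferred.
    Function SeekingCost(seekCount As Integer, secondsPerSeek As Double, bytesNeeded As Long, bytesPerSecond As Double) As Double
        Return seekCount * secondsPerSeek + bytesNeeded / bytesPerSecond
    End Function

    ' Estimated seconds to download the whole archive before reading it locally.
    Function FullDownloadCost(archiveBytes As Long, bytesPerSecond As Double) As Double
        Return archiveBytes / bytesPerSecond
    End Function
End Module
```

With those assumed figures, fetching 100,000 bytes through the seeking stream costs about 14.1 seconds. Downloading a 500,000,000-byte archive would cost about 500 seconds, so seeking wins; downloading a 5,000,000-byte archive would cost about 5 seconds, so the plain download wins.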

 

Mar 31, 2010 at 3:01 PM

On ZipFile.Read(s), it gives an error: IOException unhandled: Can't check for file existence

Coordinator
Mar 31, 2010 at 5:07 PM

What does?  Listen, I'd like to help, but I can't unless you provide more information.  A stack trace would be nice.  A code snippet. 

I am not at your computer, so I don't know what you're doing. I can't see your code. I can't see the exception you're getting. For me to help you'll have to share some of that information.

Mar 31, 2010 at 6:22 PM

Sorry for the incomplete question. I really appreciate your efforts to help others. Below is a summary:

Using s As FtpReadStream = New FtpReadStream("Myserver.com", "/alpha/beta/gamma.zip")
    s.User = "UserName"  ' optional
    s.Password = "Secret!"  ' optional
    Using zip As ZipFile = ZipFile.Read(s)
        Dim wantOverwrite as Boolean = True
        zip("MyEntryName.txt").Extract("unpack", wantOverwrite)
    End Using
End Using

In the above code, the error occurs on the line Using zip As ZipFile = ZipFile.Read(s).

System.IO.IOException was unhandled
  Message="Malformed PASV reply: 227 Entering Passive Mode (74,55,128,146,240,161) "
  Source="myTest"
  StackTrace:
       at myTest.FtpReadStream.CreateDataSocket() in C:\Users\user\Documents\myTest\myTest\ftpReadStream.vb:line 631
       at myTest.FtpReadStream.Read(Byte[] buffer, Int32 offset, Int32 count) in C:\Users\user\Documents\myTest\myTest\ftpReadStream.vb:line 344
       at Ionic.Zip.SharedUtilities._ReadFourBytes(Stream s, String message)
       at Ionic.Zip.SharedUtilities.ReadInt(Stream s)
       at Ionic.Zip.ZipFile.VerifyBeginningOfZipFile(Stream s)
       at Ionic.Zip.ZipFile.ReadIntoInstance(ZipFile zf)
       at Ionic.Zip.ZipFile.Read(Stream zipStream, TextWriter statusMessageWriter, Encoding encoding, EventHandler`1 readProgress)
       at Ionic.Zip.ZipFile.Read(Stream zipStream, TextWriter statusMessageWriter, Encoding encoding)
       at Ionic.Zip.ZipFile.Read(Stream zipStream)
       at myTest.FormmyTest.ButtonZipTest_Click(Object sender, EventArgs e) in C:\Users\user\Documents\myTest\myTest\FormmyTest.vb:line 1726
       at System.Windows.Forms.Control.OnClick(EventArgs e)
       at System.Windows.Forms.Button.OnClick(EventArgs e)
       at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
       at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
       at System.Windows.Forms.Control.WndProc(Message& m)
       at System.Windows.Forms.ButtonBase.WndProc(Message& m)
       at System.Windows.Forms.Button.WndProc(Message& m)
       at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
       at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
       at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
       at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
       at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
       at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
       at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
       at System.Windows.Forms.Application.Run(ApplicationContext context)
       at Microsoft.VisualBasic.ApplicationServices.WindowsFormsApplicationBase.OnRun()
       at Microsoft.VisualBasic.ApplicationServices.WindowsFormsApplicationBase.DoApplicationModel()
       at Microsoft.VisualBasic.ApplicationServices.WindowsFormsApplicationBase.Run(String[] commandLine)
       at myTest.My.MyApplication.Main(String[] Args) in 17d14f5c-a337-4978-8281-53493378c1071.vb:line 81
       at System.AppDomain._nExecuteAssembly(Assembly assembly, String[] args)
       at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
       at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart()
  InnerException: Nothing


        
    
Coordinator
Mar 31, 2010 at 6:35 PM

Ah, ok.  The FtpReadStream is failing.

The exception clearly shows it's failing in the parsing of the PASV reply.  Seems like your FTP server has appended an extra space at the end of the reply, which is causing the parse to fail.

To fix this, in place of:

    string ipData = resp.Reply.Substring(index1+1,index2-index1-1);

please insert:

    string ipData = resp.Reply.Trim().Substring(index1+1,index2-index1-1);

Not sure, but from here it seems like that ought to fix your current problem.

Stepping back, it looks like you've ported the FtpReadStream code to VB. Great! But listen, this is your code now. I can't really "support" that sample. I only offered it as an example of what I was talking about. In the future, you're going to have to figure this stuff out on your own.

Good luck!

Coordinator
Mar 31, 2010 at 6:52 PM
Edited Mar 31, 2010 at 6:55 PM

Actually, no.  Looking at that code again, the parsing should be simplified, like this:

private Socket CreateDataSocket()
{
    Trace("setting up datasocket");
    var resp = SendCommand("PASV");

    if(resp.Status != FtpStatusCodes.EnteringPassiveMode)
        throw new IOException(String.Format("Protocol error. Unexpected reply ({0})", resp.Reply.Substring(4)));

    int index1 = resp.Reply.IndexOf('(');
    int index2 = resp.Reply.IndexOf(')');
    if (index1 < 0 || index2 < 0)
        throw new IOException(String.Format("Protocol error. Can't parse PASV reply ({0})", resp.Reply.Substring(4)));

    string ipData = resp.Reply.Substring(index1+1,index2-index1-1);

    Trace("ipData: [{0}]", ipData);

    int[] parts = Array.ConvertAll(ipData.Split(','), x => Int32.Parse(x));

    if (parts.Length != 6)
        throw new IOException(String.Format("Malformed PASV reply [{0}]", resp.Reply));

    string ipAddress = parts[0] + "."+ parts[1]+ "." +
        parts[2] + "." + parts[3];

    int port = (parts[4] << 8) + parts[5];

    Socket s = new Socket(AddressFamily.InterNetwork,SocketType.Stream,ProtocolType.Tcp);
    IPEndPoint ep = new IPEndPoint(Dns.GetHostEntry(ipAddress).AddressList[0], port);

    try {
        s.Connect(ep);
    }
    catch(Exception exc1) {
        throw new IOException("Can't connect to remote server.", exc1);
    }

    return s;
}

Edit: Ahhh, but I think you can't do anonymous methods in VB.NET until .NET 4.0. I'm not sure though...

Anyway it looks like you have replied so I will stop editing this now...

Mar 31, 2010 at 6:53 PM

Dear Cheeso,

I'm not really sharp enough to port it from C# myself; it was converted with SharpDevelop (http://www.icsharpcode.net/OpenSource/SD/), which can convert code to VB/Python/Boo etc., and it does that very well. In fact, a similar zip project exists there, which is where I came from before finding this one and getting excellent help from you. Sorry for bothering you; the change didn't solve it. Anyway, thanks for your help. I will try to solve it myself, although I'm not sure I can.

Thanks again for your excellent help.

Dim ipData As String = resp.Reply.Trim().Substring(index1 + 1, index2 - index1 - 1)
        Dim parts As Integer() = New Integer(5) {}

        Dim len As Integer = ipData.Length
        Dim partCount As Integer = 0
        Dim buf As String = ""

        Dim i As Integer = 0
        While i < len AndAlso partCount <= 6

            'Dim ch As Char = [Char].Parse(ipData.Substring(i, 1))
            Dim ch As Char = [Char].Parse(ipData.Trim().Substring(i, 1))
            If [Char].IsDigit(ch) Then
                buf += ch
            ElseIf ch <> ","c Then
                Throw New IOException("Malformed PASV reply: " & resp.Reply)
            End If

            If ch = ","c OrElse i + 1 = len Then
                Try
                    parts(System.Math.Max(System.Threading.Interlocked.Increment(partCount), partCount - 1)) = Int32.Parse(buf)
                    buf = ""
                Catch generatedExceptionName As Exception
                    Throw New IOException("Malformed PASV reply: " & resp.Reply)
                End Try
            End If

 

Coordinator
Mar 31, 2010 at 7:00 PM

Right, you need to simplify all that parsing.  That's what I tried to do using String.Split() and Array.ConvertAll().  

I don't know why it's failing but you should be able to figure it out, with a little testing.

The only tricky part is the anonymous method in the call to Array.ConvertAll, in VB.  You'd have to convert that to a named function, I guess.
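One candidate cause can be checked in isolation, judging only from the VB snippet posted above. This assumes the original C# used a post-increment index such as parts[partCount++], which the thread does not show. The automatic conversion turns that into a Math.Max / Interlocked.Increment expression. Interlocked.Increment returns the already-incremented value, so the expression yields the new counter value rather than the old one. The hypothetical helper below reproduces just that expression.

```vbnet
Imports System
Imports System.Threading

Module IncrementDemo
    ' Same expression the converter emitted for the index in the posted VB code.
    ' Returns the index that would be used and advances the counter.
    Function ConvertedPostIncrement(ByRef counter As Integer) As Integer
        Return Math.Max(Interlocked.Increment(counter), counter - 1)
    End Function

    Sub Main()
        Dim partCount As Integer = 0
        Dim index As Integer = ConvertedPostIncrement(partCount)
        Console.WriteLine("index used: " & index)
        Console.WriteLine("counter afterwards: " & partCount)
    End Sub
End Module
```

If that is what happens, the first value is stored at index 1 instead of 0, and the sixth value targets index 6 of a six-element array. The resulting out-of-range exception would be caught and rethrown as the "Malformed PASV reply" IOException seen in the stack trace. Writing the assignment and the increment as two separate statements avoids the problem.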

 

Coordinator
Mar 31, 2010 at 7:12 PM
Edited Mar 31, 2010 at 7:13 PM

I think, something like this would work:

    Private Shared Function MyConverter(arg As String) As Int32
        Return Int32.Parse(arg)
    End Function

        ....
        Dim ipData As String = resp.Reply.Trim().Substring(index1 + 1, index2 - index1 - 1)
        Dim parts As Int32() = Array.ConvertAll(ipData.Split(Chr(44)), AddressOf MyConverter)

        If parts.Length <> 6 Then
            Throw New IOException(String.Format("Malformed PASV reply [{0}]", resp.Reply))
        End If
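Putting the pieces together, a self-contained version of the PASV parsing might look like the sketch below. The module and function names are hypothetical, and it returns a "host:port" string instead of opening a socket so that it can be tested on its own. Single-line Function lambdas are available from VB 2008 onward, so the named converter function is optional there.

```vbnet
Imports System
Imports System.IO

Module PasvParser
    ' Parses a reply such as "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)"
    ' and returns the data-connection endpoint as "h1.h2.h3.h4:port".
    Function ParsePasvReply(reply As String) As String
        Dim startIndex As Integer = reply.IndexOf("("c)
        Dim endIndex As Integer = reply.IndexOf(")"c)
        If startIndex < 0 OrElse endIndex < startIndex Then
            Throw New IOException("Can't parse PASV reply: " & reply)
        End If

        ' Only the text between the parentheses matters, so trailing
        ' whitespace after ")" has no effect on the result.
        Dim fields As String() = reply.Substring(startIndex + 1, endIndex - startIndex - 1).Split(","c)
        If fields.Length <> 6 Then
            Throw New IOException("Malformed PASV reply: " & reply)
        End If

        ' Integer.Parse throws FormatException on a non-numeric field.
        Dim parts As Integer() = Array.ConvertAll(Of String, Integer)(fields, Function(s As String) Integer.Parse(s.Trim()))

        ' The port is sent as two bytes: high byte first, then low byte.
        Dim port As Integer = (parts(4) << 8) + parts(5)
        Return String.Format("{0}.{1}.{2}.{3}:{4}", parts(0), parts(1), parts(2), parts(3), port)
    End Function
End Module
```

For the reply quoted in the stack trace above, the last two fields are 240 and 161, which gives port 240 * 256 + 161 = 61601.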

Dec 12, 2014 at 8:46 AM

Dear Cheeso Brother, please help me read a zip file located on an FTP server. I don't want to download the whole file; I only want to read the entries in it.