Using DotNetZip with LINQ (VB)

Reading Zip files: Get the largest-sized entry for each given filename. A zip file can contain multiple files with the same name, stored under different paths. One person asked how to select the ZipEntry corresponding to the largest file for any given filename. This is a job for LINQ! The query below groups the entries by bare filename and reports the largest uncompressed size in each group.
    Console.WriteLine("querying the modified zip....")
    Using zip1 As ZipFile = ZipFile.Read(ZipPath)
        ' Group the entries by bare filename (path stripped), then find
        ' the largest uncompressed size within each group.
        Dim selection = _
            From e In zip1 _
            Group e By Name = System.IO.Path.GetFileName(e.FileName) Into Group _
            Select Name, _
                   LargestSize = Group.Max(Function(entry) entry.UncompressedSize)

        For Each s In selection
            Console.WriteLine("{0}", s.ToString())
        Next
    End Using
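
The query above reports only the largest size for each name. If you want the ZipEntry itself, which is what the original question asked for, one option is to sort each group by size and take the first element. The following is a minimal sketch of that idea, not necessarily the only or best way to do it. The module name, the helper names BuildDemoZip and LargestByName, and the throwaway archive it builds in the temp directory are all hypothetical, and it assumes a DotNetZip version that provides ZipFile.AddEntry(String, String).

```vb
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports Ionic.Zip

Module LargestEntryDemo

    ' Builds a small throwaway archive (hypothetical names and contents)
    ' holding two entries that share a filename, and returns its path.
    Function BuildDemoZip() As String
        Dim zipPath = Path.Combine(Path.GetTempPath(), "linq-largest-demo.zip")
        Using zip As New ZipFile()
            zip.AddEntry("a/readme.txt", "short")
            zip.AddEntry("b/readme.txt", "a much longer body")
            zip.Save(zipPath)
        End Using
        Return zipPath
    End Function

    ' Maps each bare filename to the in-archive path of its largest entry.
    Function LargestByName(zipPath As String) As Dictionary(Of String, String)
        Dim result As New Dictionary(Of String, String)
        Using zip As ZipFile = ZipFile.Read(zipPath)
            ' Directory entries have no filename, so skip them.
            Dim query = _
                From e In zip _
                Where Not e.IsDirectory _
                Group e By Name = Path.GetFileName(e.FileName) Into Group _
                Select Name, _
                       Largest = Group.OrderByDescending(Function(entry) entry.UncompressedSize).First()

            ' Enumerate inside the Using block, while the archive is still open.
            For Each item In query
                result(item.Name) = item.Largest.FileName
            Next
        End Using
        Return result
    End Function

    Sub Main()
        For Each pair In LargestByName(BuildDemoZip())
            Console.WriteLine("{0} ==> {1}", pair.Key, pair.Value)
        Next
    End Sub

End Module
```

When two entries tie for the largest size, this picks whichever appears first in the archive, because OrderByDescending is a stable sort.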

Extracting entries: Extract the largest-sized entry for a given filename. As above, but for each filename, extract the entry that has the largest size. I don't know the background for this particular requirement, but here's how to do it:
Using zip As ZipFile = ZipFile.Read(ZipPath)
    Dim g2 = _
        From e In zip _
        Group e By Name = System.IO.Path.GetFileName(e.FileName) Into Group

    ' g2 is a GroupedEnumerable

    For Each elt In g2
        Dim x = elt
        ' get the largest of each group
        Dim largest = (From e In x.Group _
                       Where e.UncompressedSize = x.Group.Max(Function(entry) entry.UncompressedSize) _
                       Select e).First()
        Console.WriteLine("    {1,8}  {0}", largest.FileName, largest.UncompressedSize)
        If (largest.UncompressedSize > 0) Then
            ' strip the path, so the entry extracts under its bare filename
            largest.FileName = elt.Name
            ' "extract" is an assumed target directory; substitute your own
            largest.Extract("extract")
        End If
    Next
End Using

Creating a ZipFile: selecting files to add into a ZIP, via LINQ. This is a common need, and a perfect job for LINQ, although it seems many people are still not aware of how useful LINQ can be for these sorts of jobs. Suppose I want to zip up a Visual Studio solution directory, but I'd like to omit all the files in the various bin and obj directories. A simple way to accomplish that is to run msbuild /t:Clean to remove all those files before zipping the entire directory. But if for some reason cleaning the build output is not acceptable, you can use LINQ to select the specific files to include.
  '' Requires Imports for System.IO, System.Linq, 
  '' System.Text.RegularExpressions, and Ionic.Zip.
  Public Sub Run()
    Dim baseDirectory = "c:\dev\MySolutionDirectory"
    Dim di as New DirectoryInfo(baseDirectory)

    '' Use LINQ to select all files that are not in the bin or 
    '' obj directories.
    Dim fileList = From f In di.GetFiles("*.*", SearchOption.AllDirectories) _
                   Where Regex.IsMatch(f.FullName, "\\(bin|obj)\\") = False _
                   Select f.FullName

    '' Create the zip file using DotNetZip, with that selection of 
    '' filesystem files.
    Using zip as New ZipFile ()
      '' Using ZipFile.AddFiles will include the full path in the entry inside the 
      '' zip. Probably not desired.

      For Each f as String in fileList
        Console.WriteLine("  {0:23} ==> {1}", f, GetRelativePath(f,baseDirectory))
        '' Add the file entry, specifying the directory path to use within the
        '' zip. That path is the directory containing the file, with the
        '' baseDirectory removed from the beginning.
        zip.AddFile(f, GetRelativePath(f,baseDirectory))
      Next

      '' Write the archive to disk. The output filename here is an
      '' assumption; substitute your own.
      zip.Save("MySolutionDirectory.zip")
    End Using 
  End Sub

  Private Function GetRelativePath (fileName as String, baseDir as String) As String
    Dim f As String = Path.GetDirectoryName(fileName)
    f = f.Replace(baseDir, "")
    '' trim any leading backslash
    If f.StartsWith("\") Then
      f = f.Substring(1)
    End If
    Return f
  End Function
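
Note that GetRelativePath returns a directory, not a file path. That is what the two-argument ZipFile.AddFile expects: its second argument is the directory path to use within the archive. The following standalone sketch shows what the helper returns for two sample inputs. The module name and the sample paths under c:\dev\MySolutionDirectory are made up for illustration, and the results noted in the comments assume Windows, where the backslash is a directory separator.

```vb
Imports System
Imports System.IO

Module RelativePathDemo

    ' Copy of the GetRelativePath helper shown above.
    Function GetRelativePath(fileName As String, baseDir As String) As String
        Dim f As String = Path.GetDirectoryName(fileName)
        f = f.Replace(baseDir, "")
        ' trim any leading backslash
        If f.StartsWith("\") Then
            f = f.Substring(1)
        End If
        Return f
    End Function

    Sub Main()
        Dim baseDir As String = "c:\dev\MySolutionDirectory"

        ' A file two levels down maps to its directory, relative to baseDir.
        ' On Windows this prints [ProjA\Properties].
        Console.WriteLine("[{0}]", _
            GetRelativePath(baseDir & "\ProjA\Properties\AssemblyInfo.vb", baseDir))

        ' A file directly in baseDir maps to the empty string, which puts
        ' the entry at the root of the archive. On Windows this prints [].
        Console.WriteLine("[{0}]", _
            GetRelativePath(baseDir & "\MySolution.sln", baseDir))
    End Sub

End Module
```

Because the helper uses String.Replace, the match against baseDir is case-sensitive, so pass the base directory in the same casing that DirectoryInfo reports.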

Last edited Nov 13, 2010 at 4:40 PM by Cheeso, version 10

