Problem with opening zip file created from memory stream

Jul 2, 2009 at 7:52 AM

I take some PDF files from a database as streams, put them into a zip file, and send the zip as a stream straight to the browser. The resulting zip file cannot be opened with Windows Compressed Folders or WinRAR, although it works fine in Total Commander. When opening it in WinRAR, the PDF files show a CRC of 0. After I rename a file inside the archive, the CRC is calculated and everything works.

It looks like no CRC is calculated in my routine. Your help would be appreciated.
BartZip

public byte[] GetRptFile(ArrayList rptFilesIds)
{
    if (rptFilesIds.Count == 0)
        return null;

    byte[] result = null;

    if (rptFilesIds.Count == 1)
    {
        QueriesTableAdapter dtQuery = new QueriesTableAdapter(this.connectionString);
        result = dtQuery.GetRptFile(Convert.ToInt32(rptFilesIds[0])) as byte[];
        return result;
    }

    ZipFile zip = new Ionic.Zip.ZipFile();

    zip.CompressionLevel = Ionic.Zlib.CompressionLevel.BestCompression;

    for (int i = 0; i < rptFilesIds.Count; i++)
    {
        QueriesTableAdapter dtQuery = new QueriesTableAdapter(this.connectionString);
        result = dtQuery.GetRptFile(Convert.ToInt32(rptFilesIds[i])) as byte[];

        string folder = dtQuery.GetZipFolderName(Convert.ToInt32(rptFilesIds[i]));
        string fileName = dtQuery.GetRptFileName(Convert.ToInt32(rptFilesIds[i])) as string;

        folder = folder.Replace(" ", "");
        fileName = fileName.Replace(" ", "").Replace(" ", "_").Replace(".", "-").Replace(":", "-") + ".pdf";

        folder = null; // ;-)

        MemoryStream ms = new MemoryStream(result);
        zip.AddEntry(fileName, folder, ms);
    }

    MemoryStream msr = new MemoryStream();

    zip.Save(msr);

    byte[] zipbyte = new byte[msr.Capacity];

    //msr.Position = 0;

    msr.Seek(0, SeekOrigin.Begin);
    msr.Read(zipbyte, 0, msr.Capacity);

    return zipbyte;
}
Coordinator
Jul 2, 2009 at 12:44 PM

I'm not clear on what you are doing - you mentioned "directing it as a stream into browser", but I don't see anything having to do with the browser or Response.OutputStream in your code.

I do see a bunch of things in your code that relate to outside code - "mystery code". Before I can help you very much, you need to remove some of those variables to isolate the role of DotNetZip in the logic. With the code you have sent, it is not at all clear that the problem you are experiencing is directly related to DotNetZip. There are too many moving parts.

I can make a few suggestions, though.

If you want to save a zip to the browser from an ASP.NET app, then I suggest calling zip.Save(Response.OutputStream). If you write to a MemoryStream instead, all of the zip content must be held in memory before any byte is written to the browser. This can put a great deal of memory stress on a loaded system. It can be much faster to write directly to Response.OutputStream.

Of course you may have good reasons to save to a MemoryStream - maybe you are caching the result.  In that case I have some hints here as well.  Rather than doing this:

    byte[] zipbytes= new byte[msr.Capacity];
    //msr.Position = 0;
    msr.Seek(0, SeekOrigin.Begin);
    msr.Read(zipbytes, 0, msr.Capacity);
    //return zipbytes;

Can I suggest you do something like this:

    byte[] zipbytes = msr.ToArray();

If you use the former method, you will get a byte array which is end-padded with many zeroes. I don't think you want that, as it can be larger than the aggregate size of the input files, which defeats the purpose of compression.

Internally, within the MemoryStream, there is a byte[]. As you write into the stream, this byte array gets re-allocated; it "grows". The Capacity of the MemoryStream is the current size of the internal buffer, while Length gives the length of the content in the stream. Suppose I create a new MemoryStream, and (suppose) it allocates an initial buffer of 1024 bytes. If I write 1025 bytes to the MemoryStream, it will grow, and possibly double its internal buffer in size to 2048 bytes. The Capacity will be 2048, but the Length will be 1025. The ToArray() method will get a byte array of the correct Length, with no trailing zeroes. Using your method, you will get a byte array of length 2048, with 1023 bytes of zero at the end.
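
To see the difference between Capacity and Length concretely, here is a minimal console sketch that uses only System.IO. The class name CapacityVersusLength and the helper CopyUsingCapacity are made up for illustration; the helper copies the stream the same way the original routine does. The explicit initial capacity of 1024 is an assumption chosen to match the example above, and the exact Capacity after growth depends on the runtime's growth policy.

```csharp
using System;
using System.IO;

public static class CapacityVersusLength
{
    // Hypothetical helper: copies the stream into an array sized by Capacity,
    // as the original routine does.
    public static byte[] CopyUsingCapacity(MemoryStream ms)
    {
        byte[] buffer = new byte[ms.Capacity];
        ms.Seek(0, SeekOrigin.Begin);
        // Read() returns at most Length bytes; the rest of the array stays zero.
        ms.Read(buffer, 0, ms.Capacity);
        return buffer;
    }

    public static void Main()
    {
        // Assumption: an explicit initial capacity of 1024 bytes, for illustration.
        MemoryStream ms = new MemoryStream(1024);

        // Write one byte more than the initial capacity so the buffer must grow.
        byte[] payload = new byte[1025];
        for (int i = 0; i < payload.Length; i++)
            payload[i] = 0xFF;
        ms.Write(payload, 0, payload.Length);

        Console.WriteLine("Length        = " + ms.Length);
        Console.WriteLine("Capacity      = " + ms.Capacity);
        Console.WriteLine("ToArray()     = " + ms.ToArray().Length + " bytes");
        Console.WriteLine("Capacity copy = " + CopyUsingCapacity(ms).Length + " bytes");
    }
}
```

ToArray() returns exactly Length bytes, while the Capacity-sized copy carries the unused tail of the internal buffer along with it.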

The bad news is that I was unable to reproduce the problem you described relating to the missing CRC. It *could* be related to the trailing zeros; it depends on how those other tools read zip files. But I tried it with the Windows shell compressed folders, and had no problems.

As I said earlier, it seems to me there are many steps in the process, and I am not at all clear on where the error occurs.  I suggest that you take one step back and debug the process in pieces.  Is it really true that the CRC is missing when you first create the zip stream?  Are there trailing zeros in the zip stream?  The "result" that you get from GetRptFile() - is it valid content to start with?  These are things I cannot help you with. 
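
One way to answer the trailing-zeros question without any other tool is to look at the end of the byte array itself. The sketch below is not part of DotNetZip; the class ZipTailCheck and its methods are hypothetical names. It assumes a plain (non-ZIP64) archive and relies on the zip format's end-of-central-directory record: the signature bytes 50 4B 05 06, a fixed 22-byte layout, and a 2-byte comment length at offset 20. Anything after that record is padding.

```csharp
using System;

public static class ZipTailCheck
{
    private const int MinRecordSize = 22; // end-of-central-directory record without a comment

    // Returns the number of bytes that follow the end-of-central-directory record,
    // or -1 if no such record is found.
    public static int CountBytesAfterZipEnd(byte[] data)
    {
        // Search backwards, because the record sits at the end of a valid archive.
        for (int pos = data.Length - MinRecordSize; pos >= 0; pos--)
        {
            bool isSignature = data[pos] == 0x50 && data[pos + 1] == 0x4B
                            && data[pos + 2] == 0x05 && data[pos + 3] == 0x06;
            if (!isSignature)
                continue;

            // The comment length is a little-endian 16-bit value at offset 20.
            int commentLength = data[pos + 20] | (data[pos + 21] << 8);
            int end = pos + MinRecordSize + commentLength;
            if (end <= data.Length)
                return data.Length - end;
            // Otherwise this was a false match; keep searching.
        }
        return -1;
    }

    // Builds the smallest valid archive: an empty zip is just the 22-byte record.
    public static byte[] BuildEmptyZip()
    {
        byte[] zip = new byte[MinRecordSize];
        zip[0] = 0x50; zip[1] = 0x4B; zip[2] = 0x05; zip[3] = 0x06;
        return zip;
    }

    public static void Main()
    {
        byte[] clean = BuildEmptyZip();
        byte[] padded = new byte[clean.Length + 10]; // simulate a Capacity-sized copy
        Array.Copy(clean, padded, clean.Length);

        Console.WriteLine("clean:  " + CountBytesAfterZipEnd(clean));
        Console.WriteLine("padded: " + CountBytesAfterZipEnd(padded));
    }
}
```

Calling CountBytesAfterZipEnd on the array returned by your routine would show whether the stream sent to the browser carries padding; a result greater than zero points at the copy step rather than at the zip creation.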

If you can produce a test case in a console app, that clearly demonstrates the problem, OR a very clear description of how I can do the same, I would be happy to look at it again. 

 

 

 

Jul 2, 2009 at 2:05 PM

Thanks for your reply ...

You are right ......the code is not clear enough .....

I used your suggestion and took

  byte[] zipbytes = msr.ToArray();

It made the zip file half the size :-)

but the main problem still exists ....

There are 2 layers in my system ....

The first one retrieves the PDF files from the database, with this method (simplified):

public byte[] GetReportFromDatabase()
{
    MemoryStream msr = new MemoryStream();
    using (ZipFile zip = new ZipFile())
    {
        zip.CompressionLevel = Ionic.Zlib.CompressionLevel.BestCompression;
        for (int i = 1; i < 3; i++)    // only 2 pdf files
        {
            // PDF file from database (faked)
            byte[] result = new byte[] { 1, 2, 3, 4 };
            string fileName = "Rpt" + i.ToString() + ".pdf";
            MemoryStream ms = new MemoryStream(result);
            zip.AddEntry(fileName, null, ms);
        }
        zip.Save(msr);
    }
    byte[] zipbyte = msr.ToArray();
    return zipbyte;
}

 

and the second layer is an ASPX page with this main method:

protected void GetReportTEST()
{
    string fileName = "TwoPdfFiles.zip";
    byte[] filedata = GetReportFromDatabase();

    if (filedata != null && filedata.Length > 0)
    {
        Response.Clear();
        Response.ContentEncoding = System.Text.Encoding.UTF8;
        Response.ContentType = String.Format("text/{0}", fileName.Substring(fileName.LastIndexOf('.')));
        Response.Cache.SetCacheability(HttpCacheability.Private);
        Response.Expires = -1;
        Response.Buffer = false;
        Response.AddHeader("Content-Disposition", string.Format("{0};FileName=\"{1}\"", "attachment", fileName));
        Response.OutputStream.Write(filedata, 0, filedata.Length);
        Response.End();
    }
}

 

and now it works .....:-)

but the test data is only 4 bytes per file .....

and my real PDF files are about .... 622 331 Kb

It seems that my routine works .... but the problem is with the file size?

Bart

 

 

 

Coordinator
Jul 2, 2009 at 2:24 PM

Sorry, I'm not understanding the problem. It works with small byte arrays, but you're having a problem with larger files. What if you used a large byte array? What if you used a byte array of 620k? Etc etc.

All the same questions I asked you before, are still interesting, and unanswered:  Are you sure the CRC is missing?  Are you sure the PDF is valid when you get it?  Can you reduce the problem to very simple blocks to debug it?   Etc etc. 

What if you zip the PDFs from a console/command line tool, using similar code?  Does it succeed?  What if you zip a regular PDF stored in the filesystem?  Does it fail?  etc

You have a gap between the 4 byte array and the 620k thing retrieved from the database. How can you bridge the gap? This is a matter of troubleshooting and debugging your code.

 

 

Jul 2, 2009 at 2:44 PM

I think .. the problem is in the PDF names ....

Kolenkiewicz__Blazej_MultiSelect_Raport_indywidualny_(interpretacja_slowna)_2009-05-19_10-42.pdf

Kolenkiewicz__Blazej_MultiSelect_Raport_indywidualny_(wyniki_podstawowe)_2009-05-19_10-42.pdf

 

I shortened these names to Report1.pdf and Report2.pdf .... and it worked

When I open the zip file with these long names, WinRAR shows an error and the PDF CRC values = 0, so I thought there was something wrong with my routine and the CRC was not calculated for the PDFs.

When you change only one letter in a name in WinRAR's window, its CRC is recalculated, and then one can open the zip with any unzip program.

 

I will try to find out what is wrong with these names .... the brackets? Are they too long? ....

Thanks for your help and patience ...

Bart

 

 

 

Coordinator
Jul 2, 2009 at 2:47 PM

That's interesting!

Are you Polish? My mother was Polish...

Jul 2, 2009 at 2:55 PM

Yep ... I am ...

It is nice that people with Polish roots make such good software tools.

Best Regards

 

Coordinator
Jul 2, 2009 at 2:58 PM
Edited Jul 4, 2009 at 5:30 PM

I don't think it is strictly the name that is the problem. It may be related, but it is not the only problem.

I renamed one of my pdf's to use the name you gave, and when I zipped up that file, it worked just fine in WinZip.
