[Help] Inflating problem when decompressing a zipped string from a txt file.

Aug 8, 2011 at 12:07 PM
Edited Aug 8, 2011 at 12:20 PM

Hello my friends,

I would like to decompress a zipped string from a txt file with Ionic.Zlib, but I get the "Inflating" exception :(

I have downloaded the sample code "ZlibStreamExample.cs".

The sample code runs well, without problems.

But if I store the compressed text in a simple txt file and then read that file back in order to decompress it with Ionic.Zlib, I get the Inflating problem.

 

Here is my code :

            try
            {
                System.IO.MemoryStream msSinkCompressed;
                System.IO.MemoryStream msSinkDecompressed;
                ZlibStream zOut;
                String originalText = "Hello, World!  This String will be compressed... ";

                System.Console.Out.WriteLine("original:     {0}", originalText);

                // first, compress:
                msSinkCompressed = new System.IO.MemoryStream();
                zOut = new ZlibStream(msSinkCompressed, CompressionMode.Compress, CompressionLevel.BestCompression, true);
                CopyStream(StringToMemoryStream(originalText), zOut);
                zOut.Close();

                // at this point, msSinkCompressed contains the compressed bytes
                
                // Write the compressed text to a file named zipped.txt
                StreamWriter writer = new StreamWriter(@"C:\zipped.txt");
                writer.Write(MemoryStreamToString(msSinkCompressed));
                writer.Close();

                // Read the zipped.txt file and load the text on a string variable
                string compressed;
                StreamReader reader = new StreamReader(@"C:\zipped.txt");
                compressed= reader.ReadToEnd();
                System.IO.MemoryStream streamCompressed = new System.IO.MemoryStream();
                streamCompressed = StringToMemoryStream(compressed);


                // now, decompress:
                streamCompressed.Seek(0, System.IO.SeekOrigin.Begin);
                msSinkDecompressed = new System.IO.MemoryStream();
                zOut = new ZlibStream(msSinkDecompressed, CompressionMode.Decompress, true);
                CopyStream(streamCompressed, zOut);

                // at this point, msSinkDecompressed contains the decompressed bytes
                string decompressed = MemoryStreamToString(msSinkDecompressed);
                System.Console.Out.WriteLine("decompressed: {0}", decompressed);
                System.Console.WriteLine();

                if (originalText == decompressed)
                    System.Console.WriteLine("A-OK. Compression followed by decompression gets the original text.");
                else
                    System.Console.WriteLine("The compression/decompression cycle failed.");
            }
            catch (System.Exception e1)
            {
                Console.WriteLine("Exception: " + e1);
            }

Does anyone have an idea why the decompression fails when I read my compressed string from a file?

I have no problem if everything stays in memory.

Is it an encoding problem?

Thanks a lot,

Best regards,

Nixeus

 

Coordinator
Aug 8, 2011 at 2:10 PM

StreamReader and StreamWriter deal with text data.

Compressed streams are binary. You cannot use StreamWriter() to write binary data into a file. I think you know that, but you tried MemoryStreamToString() on your compressed data, which is not going to work.  It will not produce a valid string, and it will not preserve all of the binary data. The analogous problem occurs on reading the compressed data.

So: don't use StreamWriter and StreamReader on compressed data streams.  For writing a byte stream to a file, try File.WriteAllBytes.

Some additional comments: you wrote the compressed data into a file called "zipped.txt".  But a compressed data stream is not text, so you should not call it ".txt".  A file with the extension .txt is a text file; the OS assumes Notepad can open it.  For the extension, use .bin or .zlib or something else that indicates compressed data in a binary format.

Also, ZlibStream does not produce a "zipped" file.  It is a compressed file, but that is not the same as "zipped".  The ZIP format is well known and is not produced by ZlibStream; a filename with the .zip extension is a zipfile, which can be opened as a compressed folder in Windows.  So the filename "zipped.txt" is doubly confusing.  It should be something like "content.zlibbed".  I don't know of a convention for the filename extension of a file containing a zlib data stream.  The extension .Z is already taken by another format, so you will have to make one up.  .cmp might work (cmp => "compressed").

Also - there are convenience methods on the ZlibStream class that give you a compressed byte array from a string, and vice versa.  See ZlibStream.CompressString. It may save you some code.

Using that knowledge you could shorten your code considerably...

byte[] compressed = ZlibStream.CompressString("Whatever");
File.WriteAllBytes("c:\\content.zlib", compressed);

byte[] raw = File.ReadAllBytes("C:\\content.zlib");
string s = ZlibStream.UncompressString(raw);

Aug 8, 2011 at 2:37 PM

Hello and thanks a lot for your answer,

It's much easier to understand now :)

In fact, my program receives an XML string from a web service.

First, I decode the base64 string to a string (I tested this function and the base64 decoding is OK). Second, I would like to uncompress my string, so following your post I coded this:

     byte[] bArrayCompressed = System.Text.Encoding.ASCII.GetBytes(strToDecompress);
     string uncompressed = ZlibStream.UncompressString(bArrayCompressed);
     return uncompressed;

My error is: "Inflating : rc=2"

I tried replacing "Encoding.ASCII" with "Encoding.UTF8", and then I get:

* Bad check header

My XML string is compressed with ZLIB and I cannot change this.

Do you have any idea, please?

Thanks a lot,

Best regards,

Nixeus

 

 

Coordinator
Aug 8, 2011 at 2:56 PM

You said: you decode the base64 string to a string.

But wait. Imagine there is a binary datastream.  The owner of this datastream then applies a base64 encoding on it, producing a string - a text representation of the binary data. 

ok, this is what the owner of the data sends to you, I guess - a base64 string.  Decoding that string produces... a binary data stream.  A set of bytes.  Not a string.

I don't know how you got a string out of the base64 string, but it is probably wrong. You don't show the magic for how you get strToDecompress, above. You said you checked it, but you didn't say how.  I think that transformation is the source of your problem here.

I will repeat the key observation from my earlier reply: compressed data is not a string.  The fact that you named your variable strToDecompress indicates to me that you have not understood this.  The variable name indicates it is a string, and that it will be "decompressed".  In other words it is a "compressed string".  But as I told you, that makes no sense. When you compress a string (or any other type of content) with ZLIB you get a series of bytes, not a string.

The output of your base64-decode should be a byte array, or equivalent.  Not a string! That byte array is what you must use as input to ZlibStream.UncompressString(). 

 

 

Aug 8, 2011 at 3:09 PM

In fact, it's very simple:

I have coded a web service in a proprietary language.

I create my XML string, encode it in Base64, and send it over the network.

On the other side, in my C# program, I store the answer in a string variable; next I decode the base64, and then I need to uncompress it.

I compared the compressed XML that was sent with the compressed XML that was received: they are the same. So the base64 decoding is OK; the problem is the decompression, and I don't know why.

In addition, I tried compressing and decompressing with GZIP and I don't have any problem, so my network and my base64 decoding are OK.

Thanks a lot for your help :)

Best regards,

Nixeus

 

Coordinator
Aug 8, 2011 at 3:57 PM

You wrote:

>I create my XML string, and i encode it on Base64

Where is the compression step? Have you left something out?

And the output of the base64 decoder is... of what type?  You are insisting that the base64 decode step is fine, that "you checked it".  I understand.  You said that earlier.  But I asked how you checked it, and you did not explain.  Are you sure your check is valid? Explain it to me.

You wrote:

> I compared the compressed XML sended and the XML compressed received.... 

What is "compressed XML"?  What is the format of your "compressed XML"?  Is it a string?  Or a byte array?  Something else?

-------

Suppose I do this:  I take a string, the text of the Magna Carta for example.  Now I compress it using zlib, which produces a byte stream.  I then (incorrectly) try to produce a string from that byte stream, by using, for example, System.Text.Encoding.ASCII.GetString(byteStream).  This may throw an exception, or it may not, I don't know.  But let's suppose it does not throw an exception.  The resulting string "representation" of the compressed byte stream is of no value.  The transformation is not reversible, for the reasons I explained earlier: compressed data is not a string.  Therefore I have lost information when I did the ASCII.GetString().  The output is a string, but it is not a valid representation of the compressed data, and there is no way to get the original compressed data from the string.  I can send the string over the network, and compare the received string with the original "compressed string" that was sent.  They may be equal, but they are equally useless. Do you see?
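To make the information loss concrete, here is a minimal sketch. The five bytes are hypothetical stand-ins for compressed data (not real zlib output), chosen only because two of them fall outside the ASCII range. It uses nothing beyond System.Text.Encoding.ASCII, and it shows that the bytes do not survive a trip through a string.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical bytes standing in for compressed data; 0x9C and 0xFF are not valid ASCII.
byte[] original = { 0x78, 0x9C, 0xFF, 0x00, 0x41 };

// Bytes -> "string" -> bytes, the way the incorrect approach does it.
string asText = Encoding.ASCII.GetString(original);
byte[] roundTripped = Encoding.ASCII.GetBytes(asText);

Console.WriteLine(BitConverter.ToString(original));      // 78-9C-FF-00-41
Console.WriteLine(BitConverter.ToString(roundTripped));  // 78-3F-3F-00-41
Console.WriteLine(original.SequenceEqual(roundTripped)); // False
```

Every byte above 0x7F comes back as 0x3F ('?'), so the original values cannot be recovered from the string.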

Ok, start over.  Suppose I compress the text of the Magna Carta using zlib.  I get an array of bytes. Base64 encoding was designed to allow a string representation of binary data.  Knowing that, I use a base64 encoder, and produce a string representation of the compressed bytestream.  So far, I compressed something (plain text), which produced an array of bytes; then I transformed the byte array into an ASCII representation (another string).  (The base64 transformation loses much of the size reduction benefit of the compression transformation, but that is another matter).  The output base64 string is a faithful representation of the compressed data.  I can fully recover the original compressed data by applying the base64 decode operation. Now, suppose I transmit this string over the network. The receiver gets a base64-encoded string, a representation of the compressed byte stream.  I can then do the base64 decode, getting the compressed byte array.  Then I can decompress the byte array, which delivers the original text of the Magna Carta.

-----

YOUR CODE on the receiving side, apparently, somehow decodes the base64 string, and retrieves a string from that decode operation.  I'm telling you that is an invalid step.  You cannot produce a string from a base64-encoded string.  You can produce only a byte array.  I asked how you got a string from the base64 string, but you didn't respond. Are you sure you are doing this correctly? 

------

Also: if it works using GZIP, and you are controlling both ends of the wire, why not use GZIP and be done with it?

 

Aug 8, 2011 at 4:16 PM
Edited Aug 8, 2011 at 4:21 PM

Hello,

 

Sorry for my mistake.

Here is my base64Decode function:

string base64Decode(string data)
        {
            try
            {
                System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
                System.Text.Decoder utf8Decode = encoder.GetDecoder();

                byte[] todecode_byte = Convert.FromBase64String(data);
                int charCount = utf8Decode.GetCharCount(todecode_byte, 0, todecode_byte.Length);
                char[] decoded_char = new char[charCount];
                utf8Decode.GetChars(todecode_byte, 0, todecode_byte.Length, decoded_char, 0);
                string result = new String(decoded_char);
                return result;
            }
            catch (Exception e)
            {
                throw new Exception("Error in base64Decode" + e.Message);
            }
        }

I want to use ZIP because it compresses better than GZIP; I have tested it ;)

My web service returns its answer as a string value, not a byte array (which is logical, because it comes from the network).

Coordinator
Aug 8, 2011 at 4:55 PM
Edited Aug 8, 2011 at 5:00 PM

You have produced a method called base64decode(), that produces... a string! But this is invalid. The output of a base64-decode transformation is not a string. The idea behind that method of yours is wrong and doomed to fail.

The output of Convert.FromBase64String() is a byte array.  very clear, yes?
In your method, you try to UTF8-decode that byte array.  This is incorrect.  It will not work. It is unnecessary.  You're doing it wrong.

If I understand what you told me, that byte array represents the result of a ZLIB compression.  You cannot UTF8-decode that.  It is NOT A STRING !!!!!

I don't know how to say it more clearly:  It is not a string.  It is not a byte array that represents a string.  You cannot text-decode it.   It is not a string. It is not a string.  It is not a string.  It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string. It is not a string.

Let me start over.

You told me that, on the sending side, you have some XML.  You then applied a ZLIB compression of the XML to obtain a byte array.  This byte array IS NOT A STRING.  IT IS NOT A UTF8 ENCODED REPRESENTATION OF A STRING. It is a byte array containing compressed data.  OK, then to that byte array you applied a base64 encoding to obtain a base64 string.  Is all of this correct?   If so, on the receiving side you must follow the reverse of those 2 steps: base64 decode, then decompress with ZLIB, giving you the original XML string.

The output of the base64-decode step is a BYTE ARRAY.  The byte array IS NOT A STRING.  IT IS NOT A UTF-8 ENCODED REPRESENTATION OF A STRING.   You must not try to UTF8 Decode it.  You must only zlib-decompress the byte array.  That decompression will produce the original XML string. 

Like this:

Sending side:
XML -> ZLIB Compression ==> byte array
byte array -> Base64-encode  ==>  base64 string

....Transmit the base64 string....

Receiving side:
base64 string -> Base64-decode ==> byte array
byte array -> ZLIB decompression ==>  XML

(for the above, say the words "transform via" for "->"
and say "giving" for "==>")

-----

In code, on the receiving side, it looks like this:

byte[] toUncompress = Convert.FromBase64String(data);
string xml = ZlibStream.UncompressString(toUncompress);

You don't need your  base64decode() method.
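
For completeness, here is a sketch of both directions in one place. It assumes the sender behaves like ZlibStream.CompressString followed by Convert.ToBase64String; your real sender is written in another language, so treat the sending half as a stand-in. The XML literal and the variable names are hypothetical, and the Ionic.Zlib assembly must be referenced.

```csharp
using System;
using Ionic.Zlib;   // requires a reference to the Ionic.Zlib (DotNetZip) assembly

// Hypothetical payload standing in for the web service's XML.
string xml = "<answer><item>Hello, World!</item></answer>";

// Sending side (stand-in): string -> zlib bytes -> base64 text.
byte[] compressed = ZlibStream.CompressString(xml);
string wire = Convert.ToBase64String(compressed);

// Receiving side: base64 text -> zlib bytes -> string.
// No text decoding happens between the two steps.
byte[] received = Convert.FromBase64String(wire);
string roundTripped = ZlibStream.UncompressString(received);

Console.WriteLine(roundTripped == xml);
```

The only strings in this flow are the original XML, the base64 text on the wire, and the decompressed result. Everything in between is a byte array.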

-----

Also - I asked why you did not simply use GZIP instead of ZLIB, and you remarked that you use ZIP (not ZLIB) because it compresses better.  I am confused on several points here.  Your code shows you using ZLIB, not ZIP.  It may be a simple typo on your part when you said you use ZIP.  But secondly:  It is not correct to say that ZLIB compresses "better" than GZIP.  They use the same compression algorithm - exactly the same.  There is only a very small difference between ZLIB and GZIP, the small amount of data surrounding the raw compressed data - you might call it metadata.   It may be that your GZIP compressor is not very efficient, when compared to Ionic.Zlib.ZlibStream.     But if you use Ionic.Zlib.GZipStream, it will deliver the same (very good) compression as Ionic.Zlib.ZlibStream. GZIP and ZLIB are not distinct compression algorithms; they differ only in metadata leaders and trailers.

 

Aug 8, 2011 at 7:13 PM
Edited Aug 8, 2011 at 7:25 PM

Thanks a lot,

I applied your advice and it's better: I can see my XML translated!

Just a question: does Ionic.Zlib use the ZIP algorithm or the GZIP algorithm?

Because if I compress with the GZIP included in the .NET Framework, I get different results!

Just another question... do you know a simple trick to replace a "sequence of bytes" in a byte array?

In my byte array, I need to replace the byte sequence &#60; with < and &#62; with >.

I need to do that because the server uses a proprietary translation method, so after the byte[] toUncompress = Convert.FromBase64String(data); and before the uncompression, I need to do my translation. I cannot use String.Replace because it's a byte array, so is there a simple method?

Thanks again :)

Best regards,

Nixeus

 

 

Coordinator
Aug 8, 2011 at 7:48 PM

You're welcome.

The Ionic library uses the DEFLATE algorithm for GZipStream, ZlibStream, and DeflateStream.  The System.IO.Compression.DeflateStream and GZipStream also use the DEFLATE algorithm.  Within the algorithm, there are some options for doing compression more or less effectively. The System.IO.Compression classes make different choices regarding compression in the algorithm, which means they perform differently in size and speed terms.  What I mean is, the compressed output from Ionic.Zlib.GZipStream will differ from the compressed output from System.IO.Compression.GZipStream, for the same input.   But though the compressed forms differ, the Ionic and System.IO.Compression classes use the same algorithm, and they are interoperable.  You can demonstrate this yourself: the compressed output from Ionic.Zlib.GZipStream will be decompressible with System.IO.Compression.GZipStream, and vice versa.  The same is true for the corresponding DeflateStream classes.
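
If you want to see how thin the GZIP wrapper around DEFLATE is, a small sketch like the following may help. It uses only the built-in System.IO.Compression classes (not Ionic), and the input text is just a sample. It compresses the same bytes once as raw DEFLATE and once as GZIP, then prints the sizes and the first two GZIP bytes.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

byte[] input = Encoding.UTF8.GetBytes("Hello, World!  This String will be compressed... ");

// Raw DEFLATE: compressed data with no header and no trailer.
byte[] deflated;
using (var sink = new MemoryStream())
{
    using (var z = new DeflateStream(sink, CompressionMode.Compress, true))
        z.Write(input, 0, input.Length);
    deflated = sink.ToArray();
}

// GZIP: DEFLATE data wrapped in a header and a CRC/length trailer.
byte[] gzipped;
using (var sink = new MemoryStream())
{
    using (var z = new GZipStream(sink, CompressionMode.Compress, true))
        z.Write(input, 0, input.Length);
    gzipped = sink.ToArray();
}

Console.WriteLine("deflate: {0} bytes", deflated.Length);
Console.WriteLine("gzip:    {0} bytes, starts with {1:X2} {2:X2}",
    gzipped.Length, gzipped[0], gzipped[1]);
```

The GZIP output begins with the magic bytes 1F 8B and is slightly longer than the raw DEFLATE output. That extra framing is the only structural difference between the two.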

Regarding replacing bytes: here again I think you are still a little off base.  The &#60; is an HTML escape sequence representing a < , and the &#62; is the escape sequence for > .  These are character sequences, and will be present in the string that results from decompression.  At that point, it IS a string.  Decompression gives you a string (NOT a byte array), because a string was used as the original data.  So of course you can do String.Replace() on that decompression result.

Side note: it seems to me you are not completely clear on the concept of strong data types in the .NET Framework.  You might consider reading up on that topic a little.  Also look into string encoding. Maybe try this article: http://www.dijksterhuis.org/encoding-c-strings-as-byte-byte-arrays-and-back-again/

Rather than doing String.Replace, though, you might want to try out HtmlDecode:  http://msdn.microsoft.com/en-us/library/7c5fyk1k.aspx .  It's built in to the .NET Framework, and is designed for the purpose of replacing escape sequences in strings.
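
As a rough illustration, assuming the decompressed string still contains the numeric escape sequences, the decode step could look like this. The sample string is hypothetical, and the sketch uses System.Net.WebUtility.HtmlDecode, which may not be the exact overload the link above documents.

```csharp
using System;
using System.Net;

// Hypothetical decompressed result that still contains the escape sequences.
string escaped = "&#60;answer&#62;ok&#60;/answer&#62;";

// Decode on the string, after decompression; no byte-level replacement is needed.
string xml = WebUtility.HtmlDecode(escaped);

Console.WriteLine(xml); // <answer>ok</answer>
```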

Aug 8, 2011 at 10:22 PM

Thanks a lot, everything is OK :)

I just saw that base64 increases the size of the "message" to send... that's sad.

Maybe this question is stupid, but do you know if there is a way to re-compress a base64 string into a COMPRESSED base64 string?

It looks circular, no? Or is there a possibility?

 

Thanks again :)

Coordinator
Aug 8, 2011 at 11:32 PM

As you have already guessed, there is no possibility to re-compress a base64 string which itself is the representation of a compressed byte array.

My suggestion: make the compressed form (the byte array) as small as it can be, by using the highest level of compression.  When you base64-encode it, it will increase in size, but that is impossible to avoid if you are using text in transmission.
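
To put a number on that increase: base64 emits 4 characters for every 3 input bytes, so the encoded form is about one third larger. Here is a small sketch using a hypothetical 5000-byte payload (the contents do not matter, only the length):

```csharp
using System;

// Hypothetical 5000-byte compressed payload; the contents do not affect the encoded length.
byte[] compressed = new byte[5000];
string encoded = Convert.ToBase64String(compressed);

// Base64 emits 4 characters per 3 input bytes, padding the final group.
int expected = 4 * ((compressed.Length + 2) / 3);

Console.WriteLine("{0} bytes -> {1} base64 characters (formula: {2})",
    compressed.Length, encoded.Length, expected);
// 5000 bytes -> 6668 base64 characters (formula: 6668)
```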

To avoid the use of text and thus the need to base64 encode, you could consider a binary transport, or if you are using HTTP, consider gzipping the entire HTTP transaction - at the transport layer rather than at the application layer.  For example: http://yyosifov.blogspot.com/2010/02/applying-gzip-compression-to-wcf.html

 

Aug 9, 2011 at 5:34 PM

Yes !

I had forgotten a .trim() in my string-to-base64 function (in my server application).

Now my XML answer is 27 KB, the compressed form is 5 KB, and the compressed form in base64 is 6 KB! Perfect :)

I will just run some tests to see which is better, comparing GZIP and ZIP with Zlib and DotNetZip.

I will try to find the highest-compression method in these libs!

Thanks again for all your help :)

Best regards,

Nixeus