Multi-threading deflation fails in certain cases

Jan 7, 2014 at 3:30 AM
I noticed a situation where deflating doesn't write all the data into the output file. Usually everything is fine, and I haven't yet figured out which conditions trigger this, but I found at least one file that reproduces it: an MP3 file. When I create a new zip file (var objZipFile = new ZipFile(Encoding.UTF8)), add this file (objZipFile.AddFile(objFile.PathName, String.Empty)), and then call objZipFile.Save(objArchiveFile.PathName), the resulting zip file is incomplete with roughly 70-80% probability, and I can't inflate it back.
I tracked the problem down to the ParallelDeflateOutputStream.Write method together with the _DeflateOne method. ParallelDeflateOutputStream.Write spawns new tasks (threads) with ThreadPool.QueueUserWorkItem( _DeflateOne, workitem ), and each task in turn enqueues its result buffer with _toWrite.Enqueue(workitem.index). EmitPendingBuffers then actually writes the enqueued buffers, and this is where the synchronization problem occurs. ThreadPool.QueueUserWorkItem is called as many times as necessary to deflate the whole file, but let's say the last 3 tasks (threads) have not started yet. Control passes to EmitPendingBuffers to finalize the deflation, and it writes all the buffers that are ready at that point. Only after that do the last 3 tasks start working, and the final chunks of data they process never end up in the output file.
Has anybody noticed this problem before and already found a fix?
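To make the sequence above easier to follow, here is a minimal model of it. This is not DotNetZip code: RaceSketch, DeflateOne, EmitPendingBuffers and toWrite are hypothetical stand-ins for the library internals, and the Thread.Sleep only simulates work items that start late.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class RaceSketch
{
    // Hypothetical stand-in for the _toWrite queue (assumption, not library code).
    static readonly Queue<int> toWrite = new Queue<int>();

    // Simulated compression worker: finishes late, then enqueues its chunk index.
    static void DeflateOne(object state)
    {
        Thread.Sleep(50);
        lock (toWrite) toWrite.Enqueue((int)state);
    }

    // Writes whatever is queued right now and returns the count; it never waits.
    static int EmitPendingBuffers()
    {
        lock (toWrite)
        {
            int written = toWrite.Count;
            toWrite.Clear();
            return written;
        }
    }

    public static int Run(int chunks)
    {
        for (int i = 0; i < chunks; i++)
            ThreadPool.QueueUserWorkItem(DeflateOne, i);

        // Finalizing here races with the workers: chunks that are not
        // enqueued yet are silently left out of the output.
        return EmitPendingBuffers();
    }

    static void Main()
    {
        Console.WriteLine("chunks written: " + Run(8) + " of 8");
    }
}
```

How many chunks get written depends on thread pool timing, which would match the intermittent failures I see.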

Thank you,
Jan 7, 2014 at 3:46 PM
It seems like this did the trick:
        } while (doAll && (_lastFilled != _latestCompressed || _lastWritten != _latestCompressed));
but this is not entirely clean. The outer loop doesn't wait for the _DeflateOne work items to complete. I'll continue looking into it.
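One way the missing wait could look, as a rough sketch rather than a patch for ParallelDeflateOutputStream: count the outstanding work items and block until all of them have signalled before draining the queue. DrainSketch, CompressAll and the local toWrite queue are hypothetical names, and the sketch ignores the chunk ordering that the real stream has to preserve.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

static class DrainSketch
{
    public static int CompressAll(int chunks)
    {
        // Hypothetical stand-in for the _toWrite queue (assumption).
        var toWrite = new ConcurrentQueue<int>();

        // One count per queued work item; Wait() returns once all have signalled.
        using (var pending = new CountdownEvent(chunks))
        {
            for (int i = 0; i < chunks; i++)
            {
                int index = i; // capture a copy of the loop variable for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Thread.Sleep(10);       // simulate compressing one chunk
                    toWrite.Enqueue(index);
                    pending.Signal();       // signal only after the buffer is queued
                });
            }

            pending.Wait(); // block until every work item has finished
        }

        // Every chunk is in the queue now, so a single drain sees all of them.
        int written = 0;
        int ignored;
        while (toWrite.TryDequeue(out ignored)) written++;
        return written;
    }

    static void Main()
    {
        Console.WriteLine(CompressAll(8)); // prints 8
    }
}
```

Signalling only after the enqueue is what guarantees that the final drain cannot miss a buffer.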