Compression in the .Net Framework 2.0
Published date: Thursday, February 1, 2007
On: Moer and Éric Moreau's web site

Version 2.0 of the .Net Framework added built-in compression features, namely the DeflateStream class and the GZipStream class. Both are part of the System.IO.Compression namespace.

The bad news is that neither of these two classes supports what you are probably looking for: the .Zip format. If that is your only point of interest, you may skip right now to the section titled “Other tools”.

The similarities

Why two classes, you ask? Simply because each class writes a different format. Both classes have a very similar implementation, and they can often be interchanged without much pain.

As you can tell from their names, both classes work only with other streams (of up to 4 GB) and not directly with files.
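Because the classes only wrap other streams, nothing ties them to files. The following is a minimal sketch, assuming a MemoryStream as the target; the module name InMemoryDemo and the helper CompressBytes are hypothetical and not part of the demo application.

```vbnet
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module InMemoryDemo
    ' Hypothetical helper: compresses a byte array entirely in memory.
    Function CompressBytes(ByVal data As Byte()) As Byte()
        Using target As New MemoryStream()
            ' The compression stream wraps the target stream.
            Using deflater As New DeflateStream(target, CompressionMode.Compress)
                deflater.Write(data, 0, data.Length)
            End Using ' Closing the compression stream flushes the remaining bytes.
            ' ToArray still works after the MemoryStream has been closed.
            Return target.ToArray()
        End Using
    End Function

    Sub Main()
        ' Sample input: 10,000 identical characters.
        Dim original As Byte() = Encoding.UTF8.GetBytes(New String("a"c, 10000))
        Dim packed As Byte() = CompressBytes(original)
        Console.WriteLine("{0} bytes -> {1} bytes", original.Length, packed.Length)
    End Sub
End Module
```

The same helper would work with a NetworkStream or any other writable stream in place of the MemoryStream.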

The DeflateStream class

This class uses the patent-free Deflate algorithm (a combination of the LZ77 algorithm and Huffman coding).

The GZipStream class

The algorithm used here is the same as in the DeflateStream class, except that the output is wrapped in the GZip format, which adds a cyclic redundancy check (CRC) to detect data corruption and is therefore more reliable. A stream compressed this way will always be a bit larger because of this overhead.
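The overhead is easy to observe by compressing the same data with both classes and comparing the sizes. This is only a sketch: the module name OverheadDemo and the helper CompressedSize are hypothetical, and the sample data is arbitrary.

```vbnet
Imports System.IO
Imports System.IO.Compression

Module OverheadDemo
    ' Hypothetical helper: returns the compressed size of data for one algorithm.
    Function CompressedSize(ByVal data As Byte(), ByVal useGZip As Boolean) As Long
        Using target As New MemoryStream()
            Dim packer As Stream
            ' The third argument (leaveOpen) keeps the MemoryStream open
            ' after the compression stream is closed.
            If useGZip Then
                packer = New GZipStream(target, CompressionMode.Compress, True)
            Else
                packer = New DeflateStream(target, CompressionMode.Compress, True)
            End If
            packer.Write(data, 0, data.Length)
            packer.Close() ' Flushes the final block (and the GZip trailer).
            Return target.Length
        End Using
    End Function

    Sub Main()
        Dim data(9999) As Byte ' 10,000 zero bytes as sample input.
        Console.WriteLine("Deflate: {0} bytes", CompressedSize(data, False))
        Console.WriteLine("GZip:    {0} bytes", CompressedSize(data, True))
    End Sub
End Module
```

The GZip figure should come out slightly higher than the Deflate figure, the difference being the GZip header and the CRC trailer.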

While many tools like WinZip and WinRar can open files created by this class, the class cannot decompress the archives those tools create!

The demo application

The demo application created for this article lets you compress or decompress a file you select, using the algorithm you choose.

Figure 1: The demo application

I will not paste the complete code here so please download the sample application.

Compressing a stream

The compression process uses 3 streams: one for input, one for output, and one for compression. The input and output streams can be just about any kind of stream. My demo application uses FileStream.

The first thing you need to do is to open a stream that will be used to read your original file:

' Open the input file as a FileStream object.
Dim stmInFile As New FileStream(pFileName, FileMode.Open)

You also need a second stream to hold the compressed output file:

' Open the output file as a FileStream object.
Dim stmOutFile As New FileStream(pFileName & strExtension, FileMode.Create)

The last stream you need is the one that does the actual job of compression. Normally, this is also a single line of code, but because my sample lets you choose your algorithm, it is a bit more complicated:

' Create a new stream for the compressed data.
Dim stmCompressed As Stream
If pAlgorithm = enuAlgorithm.Deflate Then
    stmCompressed = New DeflateStream(stmOutFile, CompressionMode.Compress)
Else
    stmCompressed = New GZipStream(stmOutFile, CompressionMode.Compress)
End If

You need to tell the constructor of this stream where to write the result of the operation and which mode to use, Compress in this case.

Once your 3 streams are open, you need to loop through the input stream and write to the output stream through the compression stream:

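Dim arrBuffer(4095) As Byte
Dim intReadBytes As Integer
Do
    'Read a chunk of the input file
    intReadBytes = stmInFile.Read(arrBuffer, 0, arrBuffer.Length)
    'Detect the end of file
    If intReadBytes = 0 Then Exit Do
    'Write the chunk through the compression stream
    stmCompressed.Write(arrBuffer, 0, intReadBytes)
Loop
'Close the streams (closing the compression stream also closes the output file)
stmCompressed.Close()
stmInFile.Close()

That’s all you need to compress a stream. Do not skip the Close calls: until the compression stream is closed, the last compressed bytes may not have reached the output file.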

You may find shorter versions of this code, typically ones that read the whole file into memory in a single call, but be aware that they will have serious problems with large files. The version shown here reads in 4 KB chunks and can process files of up to 4 GB (the limitation of these classes).
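Put together, the steps above could look like the following self-contained sketch. It is not the code of the demo application: the module name CompressDemo, the CompressFile signature, and the use of Using blocks (which close the streams even if an exception occurs) are assumptions, and GZip is hard-coded to keep it short.

```vbnet
Imports System.IO
Imports System.IO.Compression

Module CompressDemo
    ' Sketch: compresses sourcePath into targetPath using GZip.
    Sub CompressFile(ByVal sourcePath As String, ByVal targetPath As String)
        Using stmInFile As New FileStream(sourcePath, FileMode.Open, FileAccess.Read)
            Using stmOutFile As New FileStream(targetPath, FileMode.Create)
                Using stmCompressed As New GZipStream(stmOutFile, CompressionMode.Compress)
                    Dim arrBuffer(4095) As Byte
                    Dim intReadBytes As Integer
                    Do
                        ' Read a chunk of the input file.
                        intReadBytes = stmInFile.Read(arrBuffer, 0, arrBuffer.Length)
                        ' Zero bytes read means end of file.
                        If intReadBytes = 0 Then Exit Do
                        ' Write the chunk through the compression stream.
                        stmCompressed.Write(arrBuffer, 0, intReadBytes)
                    Loop
                End Using ' Flushes the last compressed bytes.
            End Using
        End Using
    End Sub

    Sub Main()
        ' Create a sample file in the temp folder, then compress it.
        Dim source As String = Path.GetTempFileName()
        File.WriteAllText(source, New String("x"c, 50000))
        CompressFile(source, source & ".gz")
        Console.WriteLine("{0} bytes -> {1} bytes", _
            New FileInfo(source).Length, New FileInfo(source & ".gz").Length)
    End Sub
End Module
```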

Uncompressing a stream

The decompression process also uses 3 streams.

The first 2 streams (FileStream in my sample) are declared exactly the same way as for the compression process. This time, the input stream contains the compressed data.

The stream that does the actual process of decompression is declared like this:

' Create a new stream for the uncompressed data.
Dim stmCompressed As Stream
If pAlgorithm = enuAlgorithm.Deflate Then
    stmCompressed = New DeflateStream(stmInFile, CompressionMode.Decompress)
Else
    stmCompressed = New GZipStream(stmInFile, CompressionMode.Decompress)
End If

Because we are decompressing a stream, we pass the input stream to the constructor (we decompress from the input stream). Despite its name, stmCompressed is now the stream from which the decompressed data is read.

You are now ready to loop through the input to decompress the stream and write into the output stream:

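Dim arrBuffer(4095) As Byte
Dim intReadBytes As Integer
Do
    'Read a decompressed chunk through the decompression stream
    intReadBytes = stmCompressed.Read(arrBuffer, 0, arrBuffer.Length)
    'Detect the end of the data
    If intReadBytes = 0 Then Exit Do
    'Write the decompressed chunk to the output file
    stmOutFile.Write(arrBuffer, 0, intReadBytes)
Loop
'Close the streams so the output file is completely written
stmOutFile.Close()
stmCompressed.Close()

The resulting file will be identical to the original one you compressed.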
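The round trip can be checked with a small sketch like the one below. The module name DecompressDemo and the helpers DecompressFile and SameContent are hypothetical, GZip is hard-coded, and the test file is small enough to be compressed with a single Write call.

```vbnet
Imports System.IO
Imports System.IO.Compression

Module DecompressDemo
    ' Sketch: decompresses the GZip file sourcePath into targetPath.
    Sub DecompressFile(ByVal sourcePath As String, ByVal targetPath As String)
        Using stmInFile As New FileStream(sourcePath, FileMode.Open, FileAccess.Read)
            Using stmDecompress As New GZipStream(stmInFile, CompressionMode.Decompress)
                Using stmOutFile As New FileStream(targetPath, FileMode.Create)
                    Dim arrBuffer(4095) As Byte
                    Dim intReadBytes As Integer
                    Do
                        ' Read decompressed bytes from the decompression stream.
                        intReadBytes = stmDecompress.Read(arrBuffer, 0, arrBuffer.Length)
                        If intReadBytes = 0 Then Exit Do
                        stmOutFile.Write(arrBuffer, 0, intReadBytes)
                    Loop
                End Using
            End Using
        End Using
    End Sub

    ' Hypothetical check: True when both files hold the same bytes.
    Function SameContent(ByVal pathA As String, ByVal pathB As String) As Boolean
        Dim a As Byte() = File.ReadAllBytes(pathA)
        Dim b As Byte() = File.ReadAllBytes(pathB)
        If a.Length <> b.Length Then Return False
        For i As Integer = 0 To a.Length - 1
            If a(i) <> b(i) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        Dim original As String = Path.GetTempFileName()
        Dim packed As String = original & ".gz"
        Dim restored As String = original & ".out"
        File.WriteAllText(original, New String("x"c, 50000))

        ' Produce a compressed file to decompress.
        Using gz As New GZipStream(New FileStream(packed, FileMode.Create), CompressionMode.Compress)
            Dim data As Byte() = File.ReadAllBytes(original)
            gz.Write(data, 0, data.Length)
        End Using

        DecompressFile(packed, restored)
        Console.WriteLine("Round trip identical: {0}", SameContent(original, restored))
    End Sub
End Module
```

Comparing byte by byte, rather than only comparing lengths, is what actually confirms that the restored file matches the original.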

Other tools

If you need to handle real Zip files, you are probably very disappointed that the classes explained here are not suitable.

There are many solutions out there. I will point you to 2 libraries.

The first one is a commercial product called Xceed Zip for .Net, from a company based in my area.

The second library is available for free. It is called SharpZipLib.

Conclusion

These classes are well suited for internal use when you are already handling streams and you want to limit their size, either because you store them on limited-size storage or because you send them over a slow or overloaded network.

I hope you enjoyed the topic. See you next month.
