Introducing System.IO.Packaging
Published date: Tuesday, August 12, 2008
On: Moer and Éric Moreau's web site

In my February 2007 article titled “Compression in the .Net Framework 2.0”, I introduced you to two compression classes (DeflateStream and GZipStream). Neither of those classes supports the Zip format.

Since the release of the .Net Framework 3.0, we have been able to handle .zip files. This is an interesting feature because .zip is surely the most widely used compression format. Microsoft provides the classes of the System.IO.Packaging namespace, but they are so deeply hidden that most of us missed them.

These new classes suffer from serious limitations, so read on before you start digging into them.

This is my first article that specifically targets the .Net Framework 3.0 and above. To create this demo application, I used Visual Basic 2008, but you could do it with Visual Basic 2005 if you have installed the .Net Framework 3.0.

Setting the reference

When I learned of the existence of the System.IO.Packaging namespace, I first opened the reference dialog to find a component named something like System.IO.Packaging (admit that it would have made sense, since it is the name of the namespace), and I couldn’t find it.

Because I didn’t find anything close to that, I started to doubt that these classes really existed. After spending some time googling around, I found that you need to add a reference to the “WindowsBase” component, as shown in figure 1.

Figure 1: Setting the reference

Zipping files

The operation is not really straightforward: you cannot simply call a Zip method. That would have been too easy!

The first thing you need to do is to open a Package (you will find this code in the click event handler of the btnZip control):

Dim objZip As Package = ZipPackage.Open(kZipFile, FileMode.OpenOrCreate, FileAccess.ReadWrite)

Once you have the package opened, you can start adding files to your package (you will also find this code in the click event handler of the btnZip control):

AddFileToZip(objZip, IO.Path.Combine(Application.StartupPath, "Test1.txt")) 
But AddFileToZip is not a method of the Package class; it is a method you need to write! This method is in charge of reading the file you want into a byte array, creating a package part, and writing your byte array to this part. This will give you something like this:
Private Sub AddFileToZip(ByVal pZip As Package, ByVal pFileToAdd As String)
    'Validate if the file exists
    If Not File.Exists(pFileToAdd) Then
        Throw New FileNotFoundException("AddFileToZip cannot process file " + pFileToAdd)
    End If

    'Create a URI from the filename to zip (to ensure the name is valid)
    Dim partURI As New Uri(CreateUriFromFilename(pFileToAdd), UriKind.Relative)
    'Create a Package Part
    Dim pkgPart As PackagePart = pZip.CreatePart(partURI, _
                                                 Net.Mime.MediaTypeNames.Application.Zip, _
                                                 CompressionOption.Normal)
    'Read the file into a byte array
    Dim arrBuffer As Byte() = File.ReadAllBytes(pFileToAdd)
    'Write the bytes to the part and dispose the stream when done
    Using stmPart As Stream = pkgPart.GetStream()
        stmPart.Write(arrBuffer, 0, arrBuffer.Length)
    End Using
End Sub
I really think they could have done that for us!
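
The CreateUriFromFilename helper called above is not listed in this article. Here is a minimal sketch of what it could look like, assuming it only has to keep the file name (without its folder) and turn it into a valid part name. It relies on PackUriHelper.CreatePartUri, which is part of System.IO.Packaging; the body of the helper itself is my assumption, not the code of the demo application.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Packaging

Module PartUriSketch
    ' Hypothetical implementation: the helper used by the demo is not shown.
    ' Returns a part name such as "/Test1.txt" for "C:\Some Folder\Test1.txt".
    Public Function CreateUriFromFilename(ByVal pFilename As String) As String
        ' Keep only the file name; the folder is not stored in the package
        Dim uriName As New Uri(Path.GetFileName(pFilename), UriKind.Relative)
        ' CreatePartUri adds the leading "/" and escapes characters such as spaces
        Return PackUriHelper.CreatePartUri(uriName).OriginalString
    End Function
End Module
```

Because only the file name is kept, two files with the same name coming from different folders would collide in the package. That is acceptable for a demo but worth knowing.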

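A first limitation I see here is the performance of this process on very large files. And don’t even think of trying to zip petabytes with this method, because it will bomb: File.ReadAllBytes loads the whole file into memory, and both the Length property of the array and the count argument of Stream.Write are of type Int32, so a single file is capped at about 2 GB. Switching to the LongLength property (of type Int64) would not help, because Stream.Write still only accepts an Int32 count.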
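
One way around the byte array is to copy the file to the part in small chunks. The following is a sketch under assumptions, not the code of the demo application: AddFileToZipStreamed and CreateZip are hypothetical names, the part is named after the file alone, and the 64 KB buffer size is arbitrary.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Packaging

Module StreamedZipSketch
    ' Hypothetical variant of AddFileToZip: copies the file in chunks
    ' instead of loading it entirely into a byte array.
    Public Sub AddFileToZipStreamed(ByVal pZip As Package, ByVal pFileToAdd As String)
        If Not File.Exists(pFileToAdd) Then
            Throw New FileNotFoundException("AddFileToZipStreamed cannot process file " & pFileToAdd)
        End If

        ' Assumption: the part is named after the file, without its folder
        Dim partUri As Uri = PackUriHelper.CreatePartUri( _
            New Uri(Path.GetFileName(pFileToAdd), UriKind.Relative))
        Dim pkgPart As PackagePart = pZip.CreatePart(partUri, _
            System.Net.Mime.MediaTypeNames.Application.Zip, _
            CompressionOption.Normal)

        Using stmSource As Stream = File.OpenRead(pFileToAdd)
            Using stmPart As Stream = pkgPart.GetStream()
                Dim arrBuffer(65535) As Byte   '64 KB chunk
                Dim intRead As Integer = stmSource.Read(arrBuffer, 0, arrBuffer.Length)
                While intRead > 0
                    stmPart.Write(arrBuffer, 0, intRead)
                    intRead = stmSource.Read(arrBuffer, 0, arrBuffer.Length)
                End While
            End Using
        End Using
    End Sub

    ' Hypothetical call site: Using closes the package when the block ends.
    ' FileMode.Create starts from an empty package on every run
    ' (a choice made for this sketch; the demo uses FileMode.OpenOrCreate).
    Public Sub CreateZip(ByVal pZipFilename As String, ByVal pFiles As String())
        Using pkgMain As Package = Package.Open(pZipFilename, FileMode.Create, FileAccess.ReadWrite)
            For Each strFile As String In pFiles
                AddFileToZipStreamed(pkgMain, strFile)
            Next
        End Using
    End Sub
End Module
```

This only takes the byte array out of the picture. It says nothing about how the Package class itself copes with very large parts, so test with realistic file sizes before relying on it.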

The first test

You are now ready to do a first test and see some results. My demo application zips 2 text files that are in the debug folder. The resulting file is sent to the C:\Temp folder and is named MyZipTestFile.zip. When opening it with WinZip (or any other application that can handle that kind of file), you will discover another surprise: a file named [Content_Types].xml (see figure 2).

Figure 2: The content of the zip file

This file has been magically added by the Package class. As you can see in figure 3, this file simply lists all the file types included in your zip file.

Figure 3: Content of the “magic” file

So far this file reacts like a real zip file. You can even extract the files from it without any problems.

Unzipping files

Now that we have zipped files, it would surely be helpful to unzip them.

The click event handler of the btnUnzip control does some validation to ensure that the zip file exists and also that the output folder (where files will get extracted) exists (otherwise it is automatically created for you). This output folder is named Test in your debug folder. After those validations, the ExtractFromZip method is called.

This method reads like this:

Private Sub ExtractFromZip(ByVal pZipFilename As String, ByVal pOutputPath As String)
    Using pkgMain As Package = Package.Open(pZipFilename, FileMode.Open, FileAccess.Read)
        For Each pkgPart As PackagePart In pkgMain.GetParts()
            Dim strTarget As String = Path.Combine(pOutputPath, CreateFilenameFromUri(pkgPart.Uri))
            Using stmSource As Stream = pkgPart.GetStream(FileMode.Open, FileAccess.Read)
                'File.Create truncates an existing file (File.OpenWrite would leave old trailing bytes)
                Using stmDestination As Stream = File.Create(strTarget)
                    Dim arrBuffer(10000) As Byte
                    Dim intRead As Integer
                    intRead = stmSource.Read(arrBuffer, 0, arrBuffer.Length)
                    While intRead > 0
                        stmDestination.Write(arrBuffer, 0, intRead)
                        intRead = stmSource.Read(arrBuffer, 0, arrBuffer.Length)
                    End While
                End Using
            End Using
        Next
    End Using
End Sub
This method opens the zip file, retrieves each part (or file) from it, and uses streams to write the parts back to files.
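
The CreateFilenameFromUri helper used by ExtractFromZip is not listed in this article either. A minimal sketch, assuming that every part sits at the root of the package and that only the file name is wanted, could look like this (the implementation is my assumption, not the code of the demo application):

```vbnet
Imports System
Imports System.IO

Module PartFilenameSketch
    ' Hypothetical implementation: the helper used by the demo is not shown.
    Public Function CreateFilenameFromUri(ByVal pUri As Uri) As String
        ' Part names look like "/Test1.txt": undo the escaping and drop the leading "/"
        Dim strName As String = Uri.UnescapeDataString(pUri.OriginalString).TrimStart("/"c)
        ' Keep only the last segment so the file always lands in the output folder
        Return Path.GetFileName(strName)
    End Function
End Module
```

Keeping only the last segment flattens any folder structure found in the package. In exchange, a part name can never point outside the output folder.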

The second test

Now run the application again and click the Unzip button.

The content of your zip file will be extracted into a folder called Test in the startup folder (probably the debug folder).

Discovering a serious limitation

So far, we have been able to create zip files, read them with WinZip (or any other compatible application), and unzip those files.

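Now, find the zip file your application just created (probably in C:\Temp) and add a file to it with your favorite zip tool. Run the application again and click the Unzip button. Check the output folder (Test in the debug folder) and look carefully. Your newly added file is not there! The most likely explanation is that the Package class only exposes entries whose content type is declared in [Content_Types].xml, and an external tool does not update that file. For me this is a very serious limitation. It means that an application can zip files and extract from them, but you cannot simply add files manually to those zip files and expect the application to see them.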

But if you overwrite a file in your zip file and you extract from it, you will get the new content.
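
To check exactly which entries the Package class recognizes, a small diagnostic loop can print every part along with its content type. This is only a sketch; ListParts is a hypothetical helper that is not part of the demo application.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Packaging

Module ListPartsSketch
    ' Hypothetical diagnostic helper: prints the parts that the package exposes
    Public Sub ListParts(ByVal pZipFilename As String)
        Using pkgMain As Package = Package.Open(pZipFilename, FileMode.Open, FileAccess.Read)
            For Each pkgPart As PackagePart In pkgMain.GetParts()
                Console.WriteLine("{0} ({1})", pkgPart.Uri, pkgPart.ContentType)
            Next
        End Using
    End Sub
End Module
```

Comparing this list with what WinZip shows for the same file makes it easy to spot which entries are being ignored.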

Other libraries

If you find that this implementation does not fit your needs, you have at least 2 other free libraries:

If you really want to spend money (send it to me!) or if you have reached other limits (like the size of the file), you may want to look at the Xceed Zip for .Net component.

Notice that all the libraries listed above work with applications using the .Net Framework 2.0.

Conclusion

I have heard that these classes were basically developed to support Office 2007 files. For those who don’t already know the trick, rename an Office 2007 file (.docx, .xlsx, .pptx, ...) to .zip and double-click on it. You will then see all the xml files that are used.

The implementation is not complete and serious limitations exist. I would not recommend using these classes for general-purpose zipping; download another compression component instead.

I think this is the first time I have told you not to use one of the framework's classes, but considering that the implementation is not complete, that serious limitations exist, and that excellent free alternatives are available, I really think I needed to tell you.
