
Using OpenXML to fill a Word document from a .Net application
Published date: Thursday, March 24, 2011
On: Moer and Éric Moreau's web site

Everybody knows that using Office automation from an application can be a real pain. Even Microsoft itself does not recommend it for server applications (http://support.microsoft.com/kb/257757).

So what are the alternatives? I am personally a big fan of the Aspose components. But if you cannot afford them, there are other ways of dealing with the newer Office documents (now that they have moved to an XML format). What I will explore here is a simple method to fill Word documents (version 2007 and later) using the free Open XML SDK. This SDK has nothing to do with automation and can be used on a PC or server that does not have Office installed at all.

Demo Code

This month's demo code is available in both VB and C#. It was created using Visual Studio 2008.

What you need

You will first need to download and install the “Open XML SDK 2.0 for Microsoft Office” from http://www.microsoft.com/downloads/en/details.aspx?FamilyId=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en. To use this SDK, you need at least Visual Studio 2008 SP1, and your project must target the .NET Framework 3.5.

Users who open the filled document will need Office 2007 or later, or an older version with the Compatibility Pack (http://www.microsoft.com/downloads/en/details.aspx?familyid=941b3470-3ae9-4aee-8f43-c6bb74cd1466&displaylang=en) properly installed.

Creating a template document

The scenario here is a simple one where we will fill an existing .docx file. That means that this document must already exist. The great thing about a template that we fill at run time is that users can easily modify the content and the formatting without affecting the results (as long as our fields still exist).

Your template must contain a number of MERGEFIELD placeholders. Later, we will find these placeholders to pass values to them.

The biggest benefit of using templates is that a user can provide a template or modify it, and your application will still be able to fill it as long as the user keeps the same MERGEFIELD placeholders. Because the merge fields are named, you can move them around within your documents.

I have created a small template (named Template.docx) using Word 2010. This template contains 3 fields: Date, Time, and FreeText.
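
If you do not have Word at hand, a comparable template can be produced with the SDK itself. The following is only a sketch, not the way the demo template was built: the module and helper names are hypothetical, and it assumes the fields are stored in their "simple" form (the SimpleField class), which is the form the rest of this article looks for. Word can also store a field in a "complex" form spread over several runs, which this sketch does not cover.

```vb
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

Module TemplateSketch
    ' Hypothetical helper: returns a paragraph holding one simple MERGEFIELD.
    Function MergeFieldParagraph(ByVal fieldName As String) As Paragraph
        Dim field As New SimpleField()
        field.Instruction = " MERGEFIELD " & fieldName & " \* MERGEFORMAT "
        ' The run inside the field is the placeholder text Word displays.
        Dim placeholder As Run = field.AppendChild(New Run())
        placeholder.AppendChild(New Text("«" & fieldName & "»"))

        Dim para As New Paragraph()
        para.AppendChild(field)
        Return para
    End Function

    ' Creates a small .docx containing the Date, Time, and FreeText fields.
    Sub CreateTemplate(ByVal filePath As String)
        Using doc As WordprocessingDocument = WordprocessingDocument.Create(filePath, WordprocessingDocumentType.Document)
            Dim mainPart As MainDocumentPart = doc.AddMainDocumentPart()
            mainPart.Document = New Document()
            Dim docBody As Body = mainPart.Document.AppendChild(New Body())
            For Each fieldName As String In New String() {"Date", "Time", "FreeText"}
                docBody.AppendChild(MergeFieldParagraph(fieldName))
            Next
            mainPart.Document.Save()
        End Using
    End Sub
End Module
```

Calling TemplateSketch.CreateTemplate("Template.docx") should give a document with three one-line paragraphs, each showing a merge field placeholder.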

This template has been added to the solution and I have set the “Copy to Output Directory” to “Copy always” so that it always gets copied to the bin folder when the solution is compiled.

Creating the project

When you create your project, you need to add a reference to a DLL installed by the SDK. This DLL is DocumentFormat.OpenXml.dll, as shown in figure 1.

Figure 1: Reference a DLL from the SDK

The demo application contains a single form. The first textbox lets you specify a template (Template.docx by default), and the second lets you enter a text value that will be passed to the FreeText merge field. Finally, the “Fill Document” button does the entire job. Notice that you cannot run the job twice, because once it has been filled the template no longer contains merge fields; restarting the demo recopies the template.

Figure 2: The demo application running

Here is a breakdown of the important parts of code.

Opening the template

Opening the Word document file is very easy with the SDK. Only the following code is required:

Dim objWordDocx As WordprocessingDocument
objWordDocx = WordprocessingDocument.Open(txtTemplate.Text, True)
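
Because the second argument (True) opens the file for editing, the fields are replaced in the template file itself. If you would rather keep the template intact, one option is to copy it first and open the copy. This is a minimal sketch under that assumption; the module name, the OpenWorkingCopy helper, and the output path are hypothetical and not part of the demo code.

```vb
Imports System.IO
Imports DocumentFormat.OpenXml.Packaging

Module OpenTemplateSketch
    ' Hypothetical helper: copies the template, then opens the copy for editing.
    ' The caller is responsible for closing or disposing the returned document.
    Function OpenWorkingCopy(ByVal templatePath As String, ByVal outputPath As String) As WordprocessingDocument
        ' True = overwrite an existing output file
        File.Copy(templatePath, outputPath, True)
        ' True = open the package as editable
        Return WordprocessingDocument.Open(outputPath, True)
    End Function
End Module
```

The returned document can be wrapped in a Using block (Using doc As WordprocessingDocument = OpenWorkingCopy("Template.docx", "Filled.docx")) so that the file is released even if an exception is thrown while filling it.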

Once you have the document open, you need to get a pointer to the part of the document you want to work with. Here, we will be using the main document (but you could also access the header part and/or the footer part for example):

Dim objMainDoc As OpenXmlElement
objMainDoc = objWordDocx.MainDocumentPart.Document()
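
If your template also has merge fields in its headers or footers, their root elements can be collected alongside the main document. The sketch below is an illustration only (the module and function names are hypothetical); it relies on the HeaderParts and FooterParts collections of the main document part.

```vb
Imports System.Collections.Generic
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Packaging

Module DocumentRootsSketch
    ' Hypothetical helper: returns the root element of the body,
    ' followed by the root of every header and footer part.
    Function GetEditableRoots(ByVal doc As WordprocessingDocument) As List(Of OpenXmlPartRootElement)
        Dim roots As New List(Of OpenXmlPartRootElement)
        Dim mainPart As MainDocumentPart = doc.MainDocumentPart
        roots.Add(mainPart.Document)
        For Each hp As HeaderPart In mainPart.HeaderParts
            roots.Add(hp.Header)
        Next
        For Each fp As FooterPart In mainPart.FooterParts
            roots.Add(fp.Footer)
        Next
        Return roots
    End Function
End Module
```

Each element of the returned list can be searched for fields in the same way as objMainDoc, and each one has its own Save method.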

Looping through the fields

Here is the meat of this technique. You need to find the fields in the template and replace them with values you have in your application.

The first thing you need to do is to loop through the fields that are declared in your template:

For Each objField As SimpleField In objMainDoc.Descendants(Of SimpleField)()

Inside that loop, you get a reference to a SimpleField object from which you need to extract the field name (GetFieldName is available in the downloadable code – it is a method to extract the field name only):

'Clean the field name
Dim strFieldName As String = GetFieldName(objField)
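
The real GetFieldName is in the downloadable code and is not reproduced here. To give an idea of what such a helper has to do, here is a hypothetical stand-in (the module and function names are my own) that works on the raw instruction text of a field, which typically looks like " MERGEFIELD  Date  \* MERGEFORMAT ". It assumes the field name itself contains no backslash.

```vb
Imports System

Module FieldNameSketch
    ' Hypothetical stand-in, not the article's GetFieldName.
    ' Returns the merge field name found in a field instruction,
    ' or an empty string when the instruction is not a MERGEFIELD.
    Function ParseMergeFieldName(ByVal instruction As String) As String
        If String.IsNullOrEmpty(instruction) Then Return String.Empty

        Const keyword As String = "MERGEFIELD"
        Dim rest As String = instruction.Trim()
        If Not rest.StartsWith(keyword, StringComparison.OrdinalIgnoreCase) Then Return String.Empty

        ' Keep what follows the keyword
        rest = rest.Substring(keyword.Length).Trim()

        ' Drop switches such as \* MERGEFORMAT
        Dim switchPos As Integer = rest.IndexOf("\"c)
        If switchPos >= 0 Then rest = rest.Substring(0, switchPos)

        ' Names containing spaces are stored between double quotes
        Return rest.Trim().Trim(""""c)
    End Function
End Module
```

With a SimpleField in hand, the instruction text is available through objField.Instruction.Value (check that Instruction is not Nothing first).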

Now that you have the field name, you can check whether it is valid and whether you need to process it (in my case, I check if the field exists in a dictionary of field names and values). Once we know we have a field to process, we copy the XML of its formatting (using the RunProperties object) into a new run and append the new text to it:

If Not String.IsNullOrEmpty(strFieldName) Then
    'check if we have a value for this merge field
    If values.ContainsKey(strFieldName) AndAlso Not String.IsNullOrEmpty(values(strFieldName)) Then
        'Find the XML placeholder (keep the formatting of the first run only)
        Dim objRunProp As String = String.Empty
        For Each objRP As RunProperties In objField.Descendants(Of RunProperties)()
            objRunProp = objRP.OuterXml
            Exit For
        Next

        Dim objRun As New Run
        If Not String.IsNullOrEmpty(objRunProp) Then
            objRun.Append(New RunProperties(objRunProp))
        End If
        'add the text to the place holder
        objRun.Append(New Text(values(strFieldName)))

        'replace the merge field with the value
        objField.Parent.ReplaceChild(Of SimpleField)(objRun, objField)
    End If
End If
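
Put together, the loop and its body can be sketched as a single routine. This is an illustration under a few assumptions rather than the demo code itself: the module name, FillMergeFields, and ExtractName (a simplified stand-in for GetFieldName) are hypothetical. The sketch also takes a snapshot of the fields with ToList() before looping, a cautious choice since the loop edits the tree it is walking, and it clones the RunProperties element instead of going through its OuterXml.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports DocumentFormat.OpenXml
Imports DocumentFormat.OpenXml.Wordprocessing

Module MergeFieldFiller
    ' Replaces every simple MERGEFIELD under root that has a non-empty
    ' entry in values. Returns the number of fields replaced.
    Function FillMergeFields(ByVal root As OpenXmlElement, ByVal values As IDictionary(Of String, String)) As Integer
        Dim replaced As Integer = 0
        ' Snapshot the fields so the tree can be edited inside the loop
        For Each field As SimpleField In root.Descendants(Of SimpleField)().ToList()
            Dim fieldName As String = ExtractName(field)
            Dim newText As String = Nothing
            If fieldName.Length = 0 OrElse Not values.TryGetValue(fieldName, newText) OrElse String.IsNullOrEmpty(newText) Then
                Continue For
            End If

            Dim newRun As New Run()
            ' Keep the formatting of the field's first run, if it has any
            Dim props As RunProperties = field.Descendants(Of RunProperties)().FirstOrDefault()
            If props IsNot Nothing Then
                newRun.AppendChild(CType(props.CloneNode(True), RunProperties))
            End If
            newRun.AppendChild(New Text(newText))

            ' Swap the whole field for the plain run
            field.Parent.ReplaceChild(Of SimpleField)(newRun, field)
            replaced += 1
        Next
        Return replaced
    End Function

    ' Hypothetical, simplified stand-in for GetFieldName:
    ' returns the second token of " MERGEFIELD Name \* MERGEFORMAT ".
    Private Function ExtractName(ByVal field As SimpleField) As String
        If field.Instruction Is Nothing OrElse field.Instruction.Value Is Nothing Then Return String.Empty
        Dim parts As String() = field.Instruction.Value.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        If parts.Length >= 2 AndAlso String.Equals(parts(0), "MERGEFIELD", StringComparison.OrdinalIgnoreCase) Then
            Return parts(1).Trim(""""c)
        End If
        Return String.Empty
    End Function
End Module
```

A call such as MergeFieldFiller.FillMergeFields(objMainDoc, values) would then process the main document; fields without a matching value are left untouched.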

Saving and closing the document

The last thing you need to do is to save the document. First, I save the part of the document I worked with (could also be the header or the footer). Then, I save the document itself:
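
A minimal sketch of these two steps, assuming the objWordDocx variable opened earlier (the module and the SaveAndClose wrapper are hypothetical and only there to make the snippet self-contained):

```vb
Imports DocumentFormat.OpenXml.Packaging

Module SaveSketch
    Sub SaveAndClose(ByVal objWordDocx As WordprocessingDocument)
        ' Write the modified main document part back into the package.
        ' A header or footer would be saved the same way through its own part.
        objWordDocx.MainDocumentPart.Document.Save()

        ' Close the package so the file is written to disk and released
        objWordDocx.Close()
    End Sub
End Module
```

If the document was opened in a Using block, leaving the block disposes the document and the explicit Close call is not needed.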



Conclusion

The Open XML SDK is enough to satisfy many needs for filling Office documents (Word, in this case) without having to install the full application on a server or PC. This also means that you can save the price of an Office license!
