I had this free library on my “to-test list” for a very long time. This morning, I decided it was time. And honestly, I should have done it long ago!
This component is mainly used to transform any text file (fixed length or delimited) into a strongly typed structure or a DataTable. But its power doesn’t stop there!
The demo application
This demo is provided in both VB and C#. It was created using Visual Studio 2010, but all you really need is version 2.0 of the .NET Framework (so VS 2005 and up).
Figure 1: The demo application in action
Credits
I cannot take credit for this very powerful library. You should really visit the official web site to check whether a newer version has been released (the latest is dated April 10, 2007), to read the documentation, and to look at other examples.
TextFieldParser class from the .NET Framework
Do you remember my article of May 2010 in which I introduced you to the TextFieldParser class? Both components have pros and cons, and there is no clear winner. Your input file will dictate which one to use. For example, the TextFieldParser has a CommentTokens property which is very useful if your input file has comments.
So what is the biggest advantage of the FileHelpers Library? My favorite feature is that your file becomes a strongly typed resource, giving you the correct names and data types. No more typing errors in your column indexes.
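For contrast, here is a minimal sketch of the TextFieldParser approach. It assumes the same comma-delimited TestDelimited.txt file and, purely for illustration, that comment lines start with "#" (the demo file may not contain any). Notice that every column comes back as a String and is reached by its index:

```vb
Imports System
Imports Microsoft.VisualBasic.FileIO

Module TextFieldParserSketch
    Sub Main()
        Using parser As New TextFieldParser("TestDelimited.txt")
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            'Assumption: lines starting with # are comments and must be skipped
            parser.CommentTokens = New String() {"#"}

            While Not parser.EndOfData
                'ReadFields returns the columns of the current line as strings
                Dim fields As String() = parser.ReadFields()
                'Columns are reached by position; any conversion is up to you
                Console.WriteLine("ID: " & fields(0) & " Name: " & fields(1))
            End While
        End Using
    End Sub
End Module
```

A wrong index here only shows up at run time, which is exactly what the strongly typed classes of FileHelpers help you avoid.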
Handling errors
We haven’t seen anything about the library yet, but I will already introduce you to one of its properties. This handy property dictates how the engine handles the parsing errors that may occur while reading your file. This property is ErrorMode:
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue
Only 3 values are available: ThrowException, SaveAndContinue, and IgnoreAndContinue.
My demo here will use the SaveAndContinue value. The demo will simply list the rows in error.
I have introduced one erroneous row in the delimited input file and 2 incorrect rows in the fixed-width input file.
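As a rough sketch of how the other modes of the library’s ErrorMode enumeration behave (ThrowException and IgnoreAndContinue), the snippet below reads the same file twice. It relies on the cTemplateDelimited class shown in the next section. The module name is hypothetical, and I catch the general Exception type rather than assume a specific one:

```vb
Imports System
Imports FileHelpers

Module ErrorModeSketch
    Sub Main()
        Dim engine As New FileHelperEngine(GetType(cTemplateDelimited))

        'ThrowException: the first row that cannot be parsed aborts the read
        engine.ErrorManager.ErrorMode = ErrorMode.ThrowException
        Try
            engine.ReadFile("TestDelimited.txt")
        Catch ex As Exception
            Console.WriteLine("Read aborted: " & ex.Message)
        End Try

        'IgnoreAndContinue: rows in error are skipped and not reported
        engine.ErrorManager.ErrorMode = ErrorMode.IgnoreAndContinue
        Dim rows As Object() = engine.ReadFile("TestDelimited.txt")
        Console.WriteLine("Rows kept: " & rows.Length.ToString())
    End Sub
End Module
```

SaveAndContinue sits between the two: the read goes on, and the rows in error stay available for reporting, which is why the demo uses it.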
Processing delimited files
The first type of file we will process is a delimited text file.
The first thing you will need is a class that defines the records of the file to be read.
This is an example of a class that fits the TestDelimited.txt file provided with my demo application:
Imports FileHelpers

<DelimitedRecord(",")> _
Public Class cTemplateDelimited
    Public Id As Integer
    Public FullName As String
    Public Height As Decimal
    <FieldConverter(ConverterKind.Date, "yyyy-MM-dd")> _
    Public DOB As DateTime
End Class

The first thing we see at the top is an attribute specifying that the file is delimited, along with the character used as the delimiter. In this case it is set to a comma, but it can be just about any character. Then you see the 4 fields that will be filled from the input file if the parsing succeeds. As you can see, data is parsed into the correct data type right away. You can even see that the date field is parsed using a specific format.
Now that we have this class, we can create the code to open the file using this definition.
These 20 or so lines of code are all that is required to create a parser, open the file, display the rows in error (if any), and display the valid rows.

'Clean the content of the listbox to display the new results
Me.ListBox1.Items.Clear()
ListBox1.Items.Add("Processing delimited file")

'create the parser engine providing the type (the class) to use
Dim engine As New FileHelperEngine(GetType(cTemplateDelimited))

'Specifying the kind of error processing
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

'read the file
Dim result As cTemplateDelimited() = DirectCast(engine.ReadFile("TestDelimited.txt"), cTemplateDelimited())

'display some summary results
ListBox1.Items.Add(" Number of records: " + engine.TotalRecords.ToString)
ListBox1.Items.Add(" Successful: " + result.Length.ToString)

'report only if parsing errors are found
If engine.ErrorManager.ErrorCount > 0 Then
    ListBox1.Items.Add(" Errors: " + engine.ErrorManager.ErrorCount.ToString)
    For Each err As ErrorInfo In engine.ErrorManager.Errors
        ListBox1.Items.Add(String.Format(" Error: {0} (row {1})", err.ExceptionInfo, err.LineNumber))
    Next
End If

'loop through each valid row
For Each row As cTemplateDelimited In result
    ListBox1.Items.Add("ID: " + row.Id.ToString)
    ListBox1.Items.Add(" Name: " + row.FullName)
    ListBox1.Items.Add(" Height: " + row.Height.ToString)
    ListBox1.Items.Add(" DOB: " + row.DOB.ToString)
Next

Because the library parses the input file into a collection of strongly typed objects, I can now use row.FullName, for example, to access valid information.
Processing fixed-width files
Processing a fixed-width file works exactly the same way. You need to provide a class used to parse the file, like this one:

Imports FileHelpers

<FixedLengthRecord()> _
Public Class cTemplateFixed
    <FieldFixedLength(5)> _
    Public Id As Integer

    <FieldFixedLength(21), FieldTrimAttribute(TrimMode.Right)> _
    Public FullName As String

    <FieldFixedLength(7), _
     FieldConverter(GetType(TwoDecimalConverter))> _
    Public Height As Decimal

    <FieldFixedLength(15), FieldTrimAttribute(TrimMode.Right)> _
    Public Comment As String

    <FieldFixedLength(10), FieldConverter(ConverterKind.Date, "yyyy-MM-dd")> _
    Public DOB As DateTime

    ' Custom Converter
    Friend Class TwoDecimalConverter
        Inherits ConverterBase

        Public Overrides Function StringToField(ByVal from As String) As Object
            Dim res As Decimal = Convert.ToDecimal(from)
            Return res / 100
        End Function

        Public Overrides Function FieldToString(ByVal from As Object) As String
            Dim d As Decimal = CType(from, Decimal)
            Return Math.Round(d * 100).ToString()
        End Function
    End Class
End Class

This class may look more complex, but it isn’t. The first line of the class is an attribute specifying that the file is a fixed-length one. Then, each field has an attribute specifying its length. Finally, a special type converter (TwoDecimalConverter) is provided because the value in the file does not contain a decimal separator. It could contain one if you wanted it to; the converter is there just to show you how easy it is to use one.
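To make the converter’s arithmetic concrete, here is a small standalone sketch that repeats the two calculations outside the library. The sample value "0017525" is made up for illustration and is not taken from the demo file:

```vb
Imports System
Imports System.Globalization

Module ImpliedDecimalSketch
    Sub Main()
        'A 7-character field holding a value with two implied decimals
        Dim raw As String = "0017525"

        'Same arithmetic as StringToField: parse the digits, then divide by 100
        Dim height As Decimal = Convert.ToDecimal(raw) / 100
        Console.WriteLine(height.ToString(CultureInfo.InvariantCulture)) 'prints 175.25

        'Same arithmetic as FieldToString: multiply by 100 and round
        Dim written As String = Math.Round(height * 100).ToString()
        Console.WriteLine(written) 'prints 17525
    End Sub
End Module
```

Note that the leading zeros are not restored on the way back: the converter only produces the digits, and fitting them into the 7-character field is left to the engine.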
And now you are ready to process a file with code like this:
'Clean the content of the listbox to display the new results
Me.ListBox1.Items.Clear()
ListBox1.Items.Add("Processing fixed-width file")

'create the parser engine providing the type (the class) to use
Dim engine As New FileHelperEngine(GetType(cTemplateFixed))

'Specifying the kind of error processing
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue

'read the file
Dim result As cTemplateFixed() = DirectCast(engine.ReadFile("TestFixedWidth.txt"), cTemplateFixed())

'display some summary results
ListBox1.Items.Add(" Number of records: " + engine.TotalRecords.ToString)
ListBox1.Items.Add(" Successful: " + result.Length.ToString)

'report only if parsing errors are found
If engine.ErrorManager.ErrorCount > 0 Then
    ListBox1.Items.Add(" Errors: " + engine.ErrorManager.ErrorCount.ToString)
    For Each err As ErrorInfo In engine.ErrorManager.Errors
        ListBox1.Items.Add(String.Format(" Error: {0} (row {1})", err.ExceptionInfo, err.LineNumber))
    Next
End If

'loop through each valid row
For Each row As cTemplateFixed In result
    ListBox1.Items.Add("ID: " + row.Id.ToString)
    ListBox1.Items.Add(" Name: " + row.FullName)
    ListBox1.Items.Add(" Height: " + row.Height.ToString)
    ListBox1.Items.Add(" DOB: " + row.DOB.ToString)
    ListBox1.Items.Add(" Comment: " + row.Comment)
Next
The only differences here, really, are the name of the class to use as a template and the name of the file to process. Everything else is exactly the same.
Creating a file
Using the very same classes we created as templates, we can just as easily output files. Check this snippet of code:

'Requires Imports System.IO and Imports System.Collections.Generic at the top of the file
If File.Exists("temp.txt") Then File.Delete("temp.txt")

'Create a FileHelpers engine
Dim engine As New FileHelperEngine(GetType(cTemplateFixed))

'Create a list of objects to persist
Dim arrObjects As New List(Of cTemplateFixed)

'fill the list of objects
Dim item As New cTemplateFixed
item.Id = 1001
item.FullName = "Eric Moreau"
item.Height = 12345.67D
item.Comment = "Very tall"
item.DOB = New Date(1901, 2, 3)
arrObjects.Add(item)

'write the file
engine.WriteFile("temp.txt", arrObjects)
MessageBox.Show("File created")

This code starts by creating an instance of the engine exactly like we did before. We then create and fill a list of objects, and finally use the engine to write the content of the collection into a file.
And much more!
This article only touches the tip of the iceberg. If you look at the downloadable sample, you will see that the delimited example also has attributes on the class definition to skip empty lines, to skip the first line, to ignore comments ...
I invite you to explore the documentation to find all the methods, properties, and attributes (there are not that many) so that you can judge the full value of this library.
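As a hedged sketch of what such a class definition can look like, the version below adds the library’s IgnoreFirst, IgnoreEmptyLines, and IgnoreCommentedLines attributes to the delimited template. The class name and the "//" comment marker are my own assumptions for illustration, so the downloadable sample may differ:

```vb
Imports FileHelpers

'IgnoreFirst(1) skips one header line, IgnoreEmptyLines skips blank lines,
'and IgnoreCommentedLines skips lines starting with the given marker
<DelimitedRecord(","), _
 IgnoreFirst(1), _
 IgnoreEmptyLines(), _
 IgnoreCommentedLines("//")> _
Public Class cTemplateDelimitedWithOptions
    Public Id As Integer
    Public FullName As String
    Public Height As Decimal
    <FieldConverter(ConverterKind.Date, "yyyy-MM-dd")> _
    Public DOB As DateTime
End Class
```

The code that reads the file does not change at all; only the type passed to the engine does.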
Conclusion
Did I tell you it is free (commercial and non-commercial use)? It is. There is no reason not to use it whenever you have text files to process.
I have shown you here how to retrieve the file content as a collection of objects. There are also methods that read from a stream or a string, and others that return a DataTable if this is what you need.
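Here is a minimal sketch of two of those alternatives, using the engine’s ReadString and ReadFileAsDT methods. It reuses the cTemplateDelimited class defined earlier in this article. The module name and the in-memory sample record are made up for illustration:

```vb
Imports System
Imports System.Data
Imports FileHelpers

Module OtherReadMethodsSketch
    Sub Main()
        Dim engine As New FileHelperEngine(GetType(cTemplateDelimited))

        'Parse records that are already in memory instead of on disk
        Dim content As String = "1,Eric Moreau,1.80,1901-02-03" & Environment.NewLine
        Dim fromString As cTemplateDelimited() = _
            DirectCast(engine.ReadString(content), cTemplateDelimited())
        Console.WriteLine("Records from string: " & fromString.Length.ToString())

        'Load the demo file straight into a DataTable
        Dim table As DataTable = engine.ReadFileAsDT("TestDelimited.txt")
        Console.WriteLine("Rows in DataTable: " & table.Rows.Count.ToString())
    End Sub
End Module
```

The DataTable version is convenient when you want to bind the result directly to a grid, at the cost of losing the strongly typed access shown earlier.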
I don’t think this library will prevent you from processing your files! Just give it a try.