(Print this page)

Easily convert a CSV into a DataTable
Published date: Thursday, April 23, 2020
On: Moer and Éric Moreau's web site

Over the years, I wrote on that topic a few times already, each time using a different library.

Now I want to show you how to open a CSV file directly into a DataTable without using any library. Just by using plain-old OleDb intrinsic objects from the ADO.Net library.

Why DataTable? Because this object is especially useful and offers a lot of built-in features like filtering, sorting, binding, … And on top of all, this is surely an object that is awfully familiar to you!

Available source code

Both VB and C# projects are provided this month.

The solution was created using Visual Studio 2019 but should also work in previous versions.

Creating the UI

As always, the UI is not where I want to spend time because you surely have your own UI in which you will try to incorporate that feature.

Figure 1: The demo application in action

As shown in figure 1, the demo UI offers a textbox so you can specify the file you want to load, a checkbox to specify if the file as a header row or not, a button to load the file, and a datagrid to display the results.

Header row?

It is extremely easy to specify whether your file has a header row or not. The Extended properties argument lets you add the HDR argument for which you give a value of Yes or No. You will see that argument being filled in the demo below.

Different delimiter?

Handling the presence of the header is easy. What about the delimiter now? If your file is using the comma as the delimiter, you are already in business as it is the default value. But some files can have a semi-colon or a tab as their delimiter or another even weirder character. The Extended properties does not let you specify that delimiter value. But everything is not lost!

If you create a schema.ini file in the same folder as your CSV file, you can easily set the desired delimiter and even the presence of the header row. This schema.ini file, if it contains a section with a title of the same name as your file will be processed automatically.

In my demo application I have created this schema file for the tab-delimited file and the semi-colon delimited file:

[CitiesSemiColon.CSV]
ColNameHeader=True
CharacterSet=ANSI
Format=Delimited(;)

[CitiesTabDelimited.CSV]
ColNameHeader=True
CharacterSet=ANSI
Format=TabDelimited

Notice that the checkbox for the header row does not have effect anymore as the value is read from the schema file.

With this schema file in the same folder as your CSV file, your grid should now read correctly.

There is event more options that you can set using that file. Have a look at https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver?view=sql-server-ver15 for the official documentation.

The code

All that being said, I am sure you want to see what the code looks like. The most important snippet of code is a method name LoadCsvFile.

It starts by splitting the folder and the filename from the full path that is received. This is because the folder is faking the database (value provided to the Data Source property of the connection string). The filename is faking the table name in the command object.

Then, Ado.Net objects are created: a connection, a command, an adapter. The adapter fills a dataset. There is nothing special here other than the connection string specifying the OleDb provider.

Finally, the first table of the dataset is returned to the caller method. That datatable is now fully standard. It does not know it is coming from a CSV file. The datatable can be sorted, filtered, … as you would do with any other datatable object.

This is the most important snippet of code:

Private Function LoadCsvFile(ByVal pFileName As String, ByVal pHasHeader As Boolean) As DataTable
    'split the path from the filename
    Dim strPath As String = IO.Path.GetDirectoryName(pFileName)
    Dim strFile As String = IO.Path.GetFileName(pFileName)

    'Create a connection object
    Using objConn As New OleDbConnection($"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={strPath};Extended Properties=""Text;HDR={If(pHasHeader, "Yes", "No")};""")
        objConn.Open()
        'create a command object
        Using objCmdSelect As New OleDbCommand($"SELECT * FROM {strFile}", objConn)
            'create an adapter object
            Using objAdapter1 As New OleDbDataAdapter()
                objAdapter1.SelectCommand = objCmdSelect
                'fill the dataset using the adapter
                Dim objDs As New DataSet()
                objAdapter1.Fill(objDs)
                'return the datatable
                Return objDs.Tables(0)
            End Using
        End Using
    End Using
End Function
private DataTable LoadCsvFile(string pFileName, bool pHasHeader)
{
    // split the path from the filename
    string strPath = System.IO.Path.GetDirectoryName(pFileName);
    string strFile = System.IO.Path.GetFileName(pFileName);

    // Create a connection object
    string strHeader = pHasHeader ? "Yes" : "No";
    using (OleDbConnection objConn = new OleDbConnection($"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={strPath};Extended Properties='Text;HDR={strHeader};'"))
    {
        objConn.Open();
        // create a command object
        using (OleDbCommand objCmdSelect = new OleDbCommand($"SELECT * FROM {strFile}", objConn))
        {
            // create an adapter object
            using (OleDbDataAdapter objAdapter1 = new OleDbDataAdapter())
            {
                objAdapter1.SelectCommand = objCmdSelect;
                // fill the dataset using the adapter
                DataSet objDs = new DataSet();
                objAdapter1.Fill(objDs);
                // return the datatable
                return objDs.Tables[0];
            }
        }
    }
}

My demo files

I have included 4 demo files in my projects (CSV with a header row, CSV without a header row, semi-colon delimited, tab delimited). They are all included in the project to ease the packaging of the demo and have their “Copy to Output Directory” property set “Copy always” just to ease the initial feed of the textbox when the demo application starts.

I did not create myself the demo files. I found it on https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html. I have modified the file to remove the header or change the delimiter but that was it. I did not feel like creating it all by myself!

Drivers needed and x86 platform only!

You also need to install the Microsoft Access Database Engine 2010 Redistributable from https://www.microsoft.com/en-us/download/details.aspx?id=13255. You will also need to target the x86 platform (or the AnyCPU platform).

Failing any of these 2 requirements will lead you to the exception “The 'Microsoft.Jet.OLEDB.4.0' provider is not registered on the local machine”.

Conclusion

Yet another way of reading a CSV file. You have plenty of options for this common task. I figured that this one is a quick and easy one to use.


(Print this page)