I really love this one. We often have to transform a structure of data (a DataTable in my example) into another format. In those scenarios, we often start looping through our data source and concatenate values using a string builder. You will surely admit that you have done this thing way too often.
This month’s trick ends the long cryptic concatenation of strings (using string builder).
Using LINQ (specifically LINQ-to-Dataset in my case), and also XML Literals in some cases, I will show you how to generate CSV, HTML and XML outputs from your data source.
You can reuse the core of this article to build different output just as easily.
No C# code this month!
It is not because I don’t like you C# developers, it is just that an important feature is still missing from C#. I just named XML Literals. Many C# developers had requested it and it is supposed to be included into the next version (C# 4.0).
The CSV output is not using XML Literals so you could re-use that one easily.
But .Net 3.5 is required here!
To be able to use this code, you will need to use Visual Basic 2008 because LINQ and XML Literals that will be used here are part of the .Net Framework 3.5.
Figure 1:The demo application in action
We need a data source
For the sake of demoing this feature, it is a lot easier for me to simply create an in-memory DataTable. The 3 transformation we will then do will all start from this very simple DataTable. If you read some of my other articles, you might recognize this table (from which I made a huge cleanup because I don’t as much data here).
This is the CreateDataset method:
Private Function CreateDataSet() As DataSet Dim dsData As New DataSet("TestDataSet") Using dt As New DataTable("States") dt.Columns.Add("ID", GetType(Integer)) dt.Columns.Add("Code", GetType(String)) dt.Columns.Add("Description", GetType(String)) dt.Columns.Add("Population", GetType(Integer)) dt.PrimaryKey = New DataColumn() {dt.Columns("ID"), dt.Columns("IDParent")} dt.LoadDataRow(New Object() {1, "AB", "Alberta", 0}, True) dt.LoadDataRow(New Object() {2, "BC", "British Columbia", 0}, True) dt.LoadDataRow(New Object() {3, "MB", "Manitoba", 1162800}, True) dt.LoadDataRow(New Object() {4, "NB", "New Brunswick", 750600}, True) dt.LoadDataRow(New Object() {5, "QC", "Québec", 7487200}, True) dsData.Tables.Add(dt) End Using Return dsData End Function
Generate a CSV output
Like it or not, CSV files are still strongly used in many applications. Many companies haven’t switch to the XML “paradigm” yet.
So this first transformation I will show you is how to transform your DataTable into a CSV string.
The first thing we need to build is our DataTable. To build it, we just have to call the marvellous CreateDataset method as shown here:
Dim ds As DataSet = CreateDataSet()
So here is the LINQ query I built:
Dim StatesTable = _ From state In ds.Tables("States").AsEnumerable _ Order By state("Population") _ Select state("Code").ToString + ", " + _ state("Description").ToString + ", " + _ state("Population").ToString
Now that we have the LINQ query in the StatesTable variable, we can run it to get a collection of string. All we have to do, in my simple scenario, is to concatenate them all together. The simplest method is to use the String.Join method as shown here:
MessageBox.Show(String.Join(Environment.NewLine, StatesTable.ToArray), "Output to CSV")
Figure 2:The CSV output in a MessageBox
A very useful trick I found lately is that we can embed conditions into the LINQ query. Say for example that instead of showing a population value of 0, I want to show the value “unknown”. To do it, all you need to do is to replace the last line of the LINQ query with this one:
If(Convert.ToInt32(state("Population")) <= 0, "unknown", state("Population").ToString)
IF(Condition, Value if True, Value if False)
A short note on the syntax
Instead of specifying your field name between (“”) like I did in the previous example, you can also use the Bang operator (!) like this:
Dim StatesTable = _ From state In ds.Tables("States").AsEnumerable _ Order By state!Population _ Select state!Code.ToString + ", " + _ state!Description.ToString + ", " + _ If(Convert.ToInt32(state!Population) <= 0, "unknown", state!Population.ToString)
About the casting of the fields, there is yet another method which reads like this (see the Population variable on the last line):
Dim StatesTable = _ From state In ds.Tables("States").AsEnumerable _ Order By state!Population _ Select state!Code.ToString + ", " + _ state!Description.ToString + ", " + _ If(state.Field(Of Int32)("Population") <= 0, "unknown", state!Population.ToString)
Generate a XML output
If you think of it, creating a XML output instead of a CSV output is simply a matter of modifying the string concatenation of the SELECT clause of the LINQ query. That would be ugly but that would work. But wait, there is another coolest method. I just named: XML Literals.
Most of what we have done in the previous section is still valid. The only thing that will change is the LINQ query that will now read like this:
Dim StatesDoc = _ <States> <%= From state In ds.Tables("States").AsEnumerable _ Order By state("Population") _ Select <State><Code><%= state("Code") %></Code> <Description><%= state("Description") %></Description> <Population><%= If(Convert.ToInt32(state("Population")) <= 0, _ "unknown", _ Convert.ToInt32(state("Population")).ToString("#,##0")) _ %></Population> </State> %> </States>
The other thing worth mentioning is that now when the StatesTable query runs, it returns a real XML Document (not even a string!). Because it is a document and not a string collection, you can just call the ToString method to display it in the message box like this:
MessageBox.Show(StatesTable.ToString, "Output to XML")
Figure 3:The XML output in a MessageBox
Excellent. Fantastic.
I know that the DataSet provides methods to transform it into XML but here we have more than just transformation. We have sorted and converted the population value. We could have also filtered and selected only some fields, joined with another data source, ... Can the methods of the DataSet do all that? Plus the fact that your data source is not always a DataSet! I hope you are convinced.
Generate an HTML Output
Seriously, are you expecting big differences here? Does HTML really differ from XML? Not at all. Simply add a longer header to define columns directly inside the LINQ query and you are done.
My LINQ query now reads like this:
Dim StatesTable = _ <table> <tr><th>Code</th> <th>State</th> <th>Population</th> </tr><%= From state In ds.Tables("States").AsEnumerable _ Order By state("Population") _ Select <tr><td><%= state("Code") %></td> <td><%= state("Description") %></td> <td><%= If(Convert.ToInt32(state("Population")) <= 0, _ "unknown", _ Convert.ToInt32(state("Population")).ToString("#,##0")) _ %></td> </tr> %> </table>
Figure 4:The HTML output in a MessageBox
Be careful
I have replaced some code I have in a real application that generates HTML content the application sends by e-mail.
While I was converting, I found that the XML syntax is very strict. Consider this short example:
Dim StatesTable2 = _ <table> <tr> <th>long<br>title</th> </tr> </table>
Dim StatesTable2 = _ <table> <tr> <th>long<br/>title</th> </tr> </table>
Conclusion
XML Literals was a great addition to VB9. So great that even C# developers now want it.
Mixing that features to your favourite LINQ flavour can generate all kind of output. This is what I just demonstrate.
I seriously hope that you will give it a try. It is really worth it.