(Print this page)

SQL Pivot Tips and Tricks
Published date: Tuesday, February 20, 2018
On: Moer and Éric Moreau's web site

In the last few weeks, I have been busy creating a complex report for a client. The output itself is in Excel so he can continue the analysis by slicing and dicing the data even further then what I am providing.

To be able to provide data in an appropriate format, I needed to rely heavily of SQL Pivots. You can find a definition of pivot tables on WikiPedia.

This month article is about some tricks I needed to figure out to render the data.

No downloadable code this month

All the SQL code for testing is included in here. I haven’t created a downloadable file for it.

Feel free to just copy and paste the code from here.

To be able to execute the queries from this article, you will need to have access to a Microsoft SQL Server 2008 (or better). If you don’t have access to one and would like to test the feature, you could also give a try to SQL Fiddle.

Starting with a very simple example

Let’s assume you have a table with 3 columns (product, color, and quantity) and you want a pivot table summing the quantities by colors and products. Your query would like this:

DECLARE @table TABLE  (
	  Product VARCHAR(10)
	, Color VARCHAR(10)
	, Quantity	INT 
)

INSERT INTO @table ( Product, Color, Quantity )
VALUES 
  ( 'P1', 'Red', 1 )
, ( 'P1', 'Black', 11 )
, ( 'P1', 'Black', 1 )
, ( 'P1', 'Red', 55 )
, ( 'P3', 'Red', 3 )
, ( 'P3', 'Blue', 33 )
, ( 'P4', 'Red', 4 )
, ( 'P4', 'Black', 67 )

SELECT * 
FROM @table
PIVOT(SUM(Quantity) FOR Product IN ([P1], [P2], [P3], [P4], [P5])) AS Pivoted

Figure 1: Result of the simple query

If all the request to create pivot tables were that easy, it would just be to simple but it isn’t the case and you quickly get into results that you are not expecting.

Read on.

Your table has more columns than 3 columns

The first issue you might encounter is that your table on which you are pivoting has more columns than what you need in the output. If we reuse the previous query and add a Shipped column to the table, fill it with some random values like this:

DECLARE @table TABLE (
	  Product VARCHAR(10)
	, Color VARCHAR(10)
	, Quantity	INT 
	, Shipped	INT 
)

INSERT INTO @table ( Product, Color, Quantity, Shipped )
VALUES 
  ( 'P1', 'Red', 1, 101 )
, ( 'P1', 'Black', 11, 102 )
, ( 'P1', 'Black', 1, 103 )
, ( 'P1', 'Red', 55, 104 )
, ( 'P3', 'Red', 3, 105 )
, ( 'P3', 'Blue', 33, 106 )
, ( 'P4', 'Red', 4, 107 )
, ( 'P4', 'Black', 67, 108 )

SELECT * 
FROM @table
PIVOT(SUM(Quantity) FOR Product IN ([P1], [P2], [P3], [P4], [P5])) AS Pivoted

Notice here that the pivot query hasn’t changed at all. The generated output would contain many rows that you were probably not expecting.

Figure 2: Unexpected results

The reason is that any columns that are available are passed to the PIVOT function, so all apart from the column aggregated, and the column pivoted are implicitly grouped by.

The fix to that issue is very simple. You need to limit the columns passed to the pivot function which is easily done by using a sub-query:

SELECT * 
FROM (
	SELECT Product, Color, Quantity FROM @table
) AS Source
PIVOT(SUM(Quantity) FOR Product IN ([P1], [P2], [P3], [P4], [P5])) AS Pivoted

By running this new query, you would get exactly the same results as shown in figure 1.

Dynamic columns

I don’t know if you notice in the previous results (figure 1 and 2) but a column is shown for products P2 and P5 even if I don’t have any data for those products. There is apparently not much validation on column names. You could also easily omit one product name and the query wouldn’t complain.

This is the other issue that you might have. The columns to output (the product in my case), should really be dynamic. But they aren’t. At least not directly.

In order to get dynamic columns, you will need to rely on dynamic SQL code. The trick is to concatenate all product code in a string variable and use it later in your pivot query like this:

CREATE TABLE #table (
	  Product VARCHAR(10)
	, Color VARCHAR(10)
	, Quantity	INT 
)

INSERT INTO #table ( Product, Color, Quantity )
VALUES 
  ( 'P1', 'Red', 1 )
, ( 'P1', 'Black', 11 )
, ( 'P1', 'Black', 1 )
, ( 'P1', 'Red', 55 )
, ( 'P3', 'Red', 3 )
, ( 'P3', 'Blue', 33 )
, ( 'P4', 'Red', 4 )
, ( 'P4', 'Black', 67 )

DECLARE @columns AS NVARCHAR(MAX)
SELECT @columns = ISNULL(@columns + ',' , '') + QUOTENAME(Product)
FROM (SELECT DISTINCT TOP(100) PERCENT Product FROM #table ORDER BY Product) AS TheColumns

DECLARE @query AS NVARCHAR(MAX)
SET @query = 
	N'SELECT Color, ' + @columns + '
	FROM ( SELECT Product, Color, Quantity FROM #table ) AS Source
	PIVOT(SUM(Quantity) FOR Product IN (' + @columns + ')) AS Pivoted'

EXEC SP_EXECUTESQL @query

DROP TABLE #table

Notice that I had to transform my table variable (@table) to a temporary table (#table) to be able to use it from my dynamic SQL. If you rely on real tables, you won’t have that issue.

With this query in place, running the query will give you only the columns for existing products as shown in figure 3.

Figure 3: Results of the dynamic columns

Single column table

You might come up with a situation (it happened to me, it can happen to you!) in which you need to pivot a single column of data. The problem is that you normally have nothing to pivot against if you only have 1 column.

The trick here is to introduce a second dummy column, a simple Row_Number, just to get a second column.

This is the query that will be required (and noticed that we are still using the dynamic columns):

CREATE TABLE #table (
	  Product VARCHAR(10)
)

INSERT INTO #table ( Product )
VALUES 
  ( 'Product 1' )
, ( 'Product 3' )
, ( 'Product 4' )

DECLARE @columns AS NVARCHAR(MAX)
SELECT @columns = STUFF((SELECT ','+QuoteName(Row_Number() OVER (ORDER BY (SELECT NULL))) FROM #table FOR XML PATH('')),1,1,'')

DECLARE @query AS NVARCHAR(MAX)
SET @query = 
	N'SELECT *
	FROM ( SELECT Product, RowN = Row_Number() OVER (ORDER BY Product) FROM #table ) AS Source
	PIVOT(MAX(Product) FOR RowN IN (' + @columns + ')) AS Pivoted'

EXEC SP_EXECUTESQL @query

DROP TABLE #table

Figure 4: results of a single column table

Need to pivot on dates?

There is a little trick you need to know here if you want to pivot on years and months of dates you have in your data. You need to split them. Because the months are fixed, we don’t have to use dynamic columns. This is the kind of query you will need:

DECLARE @table TABLE (
	  Product VARCHAR(10)
	, OrderDate DATETIME
	, Quantity	INT 
)

INSERT INTO @table ( Product, OrderDate, Quantity )
VALUES 
  ( 'P1', '2018-01-01', 1 )
, ( 'P1', '2017-01-01', 11 )
, ( 'P1', '2017-12-31', 1 )
, ( 'P1', '2018-01-15', 55 )
, ( 'P3', '2018-02-14', 3 )
, ( 'P3', '2017-06-06', 33 )
, ( 'P4', '2017-12-31', 4 )
, ( 'P4', '2017-12-06', 67 )

SELECT * 
FROM (SELECT Product, YEAR(OrderDate) AS[Year], MONTH(OrderDate) AS [Month], Quantity FROM @table) AS Source
PIVOT(SUM(Quantity) FOR [Month] IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12])) AS Pivoted

Notice that the sub-query is using the Year and Month function to create 2 new fields from your date.

Figure 5: Pivoting on dates

Conclusion

Pivoting a table is an extremely useful feature of SQL. It is missing some features but known and documented workarounds make it very powerful.

If that feature is interesting for you, you might also want to read about the unpivot feature which does the reverse and is much simpler to use as there is no aggregation of data whatsoever.


(Print this page)