Moer and Éric Moreau's web site

(Print this page)

Microsoft SQL Server 2016 Temporal tables features

Published date: Thursday, July 21, 2016

On: Moer and Éric Moreau's web site

Original URL: http://www.emoreau.com/Entries/Articles/2016/07/Microsoft-SQL-Server-2016-Temporal-tables-features.aspx

Following my last month article on the new Native JSON support feature of SQL Server, I decided to continue the exploration of another new feature of SQL 2016.

This month, it will be about Temporal tables. This feature is also sometime referred to as a system-versioned table.

Downloadable code

The code from this month is limited to a SQL script that you can try against a SQL Server 2016 instance. It won’t work with previous versions.

From a VB/C# developer point-of-view, there is nothing different (other than a few SQL words you can use here and there) but nothing from the ADO.Net side.

The scripts of this article have been tested against the RTM version of the tool.

What is this new feature about?

This new feature provides built-in support historical data in a table without having to do all the manipulation yourself. Before being able to use this new feature, we will need to declare a special history table that will be fed automatically using some period fields we will define. These period fields need be declared with the datetime2 data type.

In short, a table will always contain the latest and greatest values while the historical data will be moved automatically to historical table associated with it. We will then be able to get the value at any point in time for a particular record. We will also be able to query just the current table of just the historical values if need be.

Why is this new feature interesting?

This feature becomes critical in a couple of scenarios. For example, you might need to keep an audit trail of the modification that were made to a table. You might want to go back in time for some data to be able to reproduce a report as of a prior date.

Since the beginning of the databases era, programmers and/or DBAs were required to find ways to provide a solution to that problem and it was never as efficient as what is now offered out-of-the-box!

Very often, programmers and DBAs who have successfully been able to create workarounds to this requirement, have some drawbacks in performance. The good news about the intrinsic feature is that it has been optimized for that kind of requests.

Another aspect that makes this feature interesting is that if you can easily starts collecting historical data without too much modifications to your database and no changes at all to your data manipulations (insert/update/delete) or your queries (select). When you want to query the historical data, you will need to add some elements to your queries.

Why is it not so interesting?

To me, it looks like Microsoft used the MVP approach. By MVP here I really mean the Minimal Viable Product.

I think (and this is my own opinion) that this feature is really great if you are handling live data.

For example, if you have a series of devices that send you the actual temperature every so often and you want to keep that temperature for each device into a table. Also part of your requirements is to store only the latest value for performance reason but would like to keep the historical values in another table. Well in that particular case, the temporal table feature is designed for you.

On the other side, if like me you are dealing with non-live data and for which the date is important, then you will have to keep your current ways of dealing with it. For example, if you want to keep track of the addresses of your clients (keeping the last one in the current table while maintaining the historical in an another table) that would work but because you cannot set the date intervals yourself (part of the current limitations), you cannot “fore date” or backdate any addresses change. Any updates to the addresses will be marked with current time stamps which in many cases do not reflect the reality. This would be called as “application-period temporal tables”. If this is the kind of feature you would prefer/need, I invite you to vote on this suggestion on Connect.

Also, but this is not a show stopper preventing from starting using this new feature, the designers are not ready yet for the feature. If you want to create a new temporal table or update an existing one to support that feature, you will need to prove your scripting skills!

Creating a temporal table

You don’t have to start with a fresh table. As long as your existing table as all the requirements (primary key, datetime2 columns, …), you can add the history to an existing table. And as I just said, you will need some scripting skills.

For the purpose of the demo, I will create a brand new table.

Figure 1: Adding a new temporal table

So if you right-click Tables and drill-down that path, you will find something to create a “system-versioned table”. Don’t you find it strange that there is only one option under “Temporal Table”. That can be a clue that other types of temporal tables are coming!

Anyway, if you click that option, don’t expect a designer. Instead, you will get the template of a script.

Instead of going through that script, here is the one I created for my demo:

CREATE TABLE dbo.Temperature
(
	EquipmentID INT NOT NULL PRIMARY KEY,
	Temperature NUMERIC (12,4),
	DateStart DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
	DateEnd DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
	PERIOD FOR SYSTEM_TIME (DateStart, DateEnd)
) WITH ( SYSTEM_VERSIONING = ON ( HISTORY_TABLE = dbo.TemperatureHistory ) )

Here are the main elements to get from this script:

The EquipmentID and the Temperature fields are just mine and not related to the Temporal feature
There is a primary key
DateStart and DateEnd are defined with a DateTime2 datatype
DateStart and DateEnd are defined with the “Generated Always As Row Start|End”
DateStart and DateEnd could have been named whatever you want them
The “Period For System_Time” specifies the 2 fields containing the dates (this is why your fields can have just any names)
The “With System_Versioning” is set to ON to start monitoring inserts, updates and deletes
The History_Table specify a name (otherwise a default name will be generated and you will not like it if you need to query the history!)

If you run this statement, and you refresh your Object Explorer, you will find the special TemperatureHistory table right under the parent table (with a special icon on the left to quickly identify temporal tables).

Figure 2: The new table seen from the SSMS Object Explorer

Also, if you right click that new table, you will not find the Design option because, as said previously, the designer is not ready for this feature. If you need to do any modifications to that table, you will need to use scripts once again.

If your scripting skills are really bad, you can fool SSMS by turning off the versioning. After a refresh of your Object Explorer, you will then be able to use the designer to modify your table definition. Once you are done, you need to turn back on the versioning. While versioning is off, your history table are shown just like any other tables.

You turn off and on versioning by simple scripts:

ALTER TABLE dbo.Temperature SET (SYSTEM_VERSIONING = OFF)
ALTER TABLE dbo.Temperature SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.TemperatureHistory))

Be very cautious with the changes you are doing to your tables. For example, if you are adding a field to your Temperature table, you also need to add it yourself to the TemperatureHistory table before being able to turning back on the versioning. Otherwise you will get the following error:

Msg 13523, Level 16, State 1, Line 76
Setting SYSTEM_VERSIONING to ON failed because table 'TestTemporal.dbo.Temperature' has 5 columns and table 'TestTemporal.dbo.TemperatureHistory' has 4 columns.

Inserting test data

Before being able to demonstrate the real strength of this feature, we need to insert some data to play with.

I start by inserting these 3 rows:

INSERT INTO dbo.Temperature ( EquipmentID, Temperature )
VALUES (1, 25.1111 )
, (2, 25.2222)
, (3, 25.3333)

At this point, if you query your tables, without any surprise you will find 3 rows in your current table while the history will be empty:

SELECT 'Temperature', * FROM Temperature ORDER BY EquipmentID
SELECT 'TemperatureHistory', * FROM TemperatureHistory ORDER BY EquipmentID, DateStart

Figure 3: Results of the queries

One thing to notice here are the DateStart and DateEnd fields that were automatically fed. BTW, the date and time you see in these fields are UTC time (so you probably will need to do some math to convert to local time if it is important). Don’t even try to set these fields yourself or else you will get this error:

Msg 13536, Level 16, State 1, Line 36
Cannot insert an explicit value into a GENERATED ALWAYS column in table 'TestTemporal.dbo.Temperature'. Use INSERT with a column list to exclude the GENERATED ALWAYS column, or insert a DEFAULT into GENERATED ALWAYS column.

Updating data

To see the real benefits of this feature, we need to start updating and/or deleting from that table.

In order to generates some random updates, I run this simple loop:

DECLARE @counter SMALLINT 
SET @counter = 1
WHILE @counter <= 20
BEGIN  
	UPDATE T
	SET Temperature = R.Temperature
	FROM dbo.Temperature AS T
	INNER JOIN (    
		SELECT CAST(RAND() * 3 AS INT)+ 1 AS EquipmentID, CAST( 25 + RAND() * 2  AS NUMERIC(12,4) ) AS Temperature
	) AS R
	ON R.EquipmentID = T.EquipmentID

    SET @counter = @counter + 1  

	WAITFOR DELAY '00:00:05'
END

This loop will run 20 times generating random temperature for a random equipment. It is also waiting 5 seconds between each iteration so I can have some room to do my queries later.

Nowhere in this query, I ask the table to persist the historical values. It is done automatically because we have declared everything we needed when creating the table.

At this point you should have 3 rows in the current table and 20 in the history table (because of the 20 updates).

Deleting data

Also to show better results, I delete one of the equipment:

DELETE FROM dbo.Temperature WHERE EquipmentID = 2

So if you followed me, I have inserted 3 records, then applied 20 random updates to finally delete 1 equipment.

The question is now: How many records will I find in my tables? The answer is 23. The current table contains 2 equipment (id 2 has been deleted) but the history still contains the values for that equipment because it was existing at some point.

The data we have so far

You remember the 2 SELECT queries here above:

SELECT 'Temperature', * FROM Temperature ORDER BY EquipmentID
SELECT 'TemperatureHistory', * FROM TemperatureHistory ORDER BY EquipmentID, DateStart

If you run them again, you will get data similar to this:

Figure 4: Data after 24 operations

The first thing you can see is that the current table always contain only the current data. The history table contains all the values that have been updated/overridden by any Update queries. It also contains all the data from the deleted equipment.

The other thing you can see is that we can query specifically the current or the historical table.

And finally, so far, nothing has been added to the queries to specify that it was a temporal table.

Querying historical data

A couple of keywords have been added to be able to correctly query the historical data. Those are:

As Of datetime
From startdatetime To enddatetime
Between startdatetime And enddatetime
Contained In (startdatetime, enddatetime)
All

Using these keywords after “For System_Time” which indicates that we are interested in the rows of the versioning system (instead of the just the current data table), I can then create queries that are fully taking advantage of this new feature.

Querying historical data – As Of

For example, if I want to know the state as of a specific point in time, I can run this query:

SELECT * FROM dbo.Temperature FOR SYSTEM_TIME AS OF '2016-07-20 19:40:40' ORDER BY EquipmentID

How many rows will be returned? Remember equipment 2 is deleted but it was deleted after 7.40:40PM. So the answer is 3. As of the time that the query wants to see the results for, equipment 2 was still existing.

Figure 5: Result of the previous query

Querying historical data – From To

Now if we are interested in all the variations for a time range, we could possibly union the current with the history table but Microsoft has created a syntax for that:

SELECT * FROM dbo.Temperature FOR SYSTEM_TIME FROM '2016-07-20 19:40:40' TO '2016-07-20 19:41:09.4080004' ORDER BY EquipmentID, DateStart

Figure 6: Result of the previous query

That query will show you any rows where your FROM and TO date are somewhere between your DateStart and DateEnd.

Querying historical data – Between

The difference to the previous is very subtle here.

If I would have set the end date to 2016-07-20 19:41:09 (without the milliseconds), I would have get the exact same result. But since I have been very specific about the time (including the milliseconds), the Between keyword includes rows that became active on the upper boundary.

So that query which contains exactly the same boundaries as the previous section returns 1 additional row:

SELECT * FROM dbo.Temperature FOR SYSTEM_TIME BETWEEN '2016-07-20 19:40:40' AND '2016-07-20 19:41:09.4080004' ORDER BY EquipmentID, DateStart

Figure 7: Result of the previous query

Querying historical data – Contained In

And sometime what we really want to see is everything that both DateStart and DateEnd are contained inside the provided boundaries. This is exactly what the CONTAINED IN query will return:

SELECT * FROM dbo.Temperature FOR SYSTEM_TIME CONTAINED IN ('2016-07-20 19:40:40' , '2016-07-20 19:41:09.4080004') ORDER BY EquipmentID, DateStart

Figure 8: Result of the previous query

Querying historical data – ALL

This last syntax simply returns a UNION ALL of both the actual data and the historical data:

SELECT * FROM dbo.Temperature FOR SYSTEM_TIME ALL ORDER BY EquipmentID, DateStart

And it comes with some limitations

As all the new features, this one comes with some limitations. Microsoft is pretty good at smoothing those limitations in following versions of the tool. But for now, when using SQL Server 2016, we have these limitations (some already listed here above):

The actual data table must have a primary key
The truncate table statement is not supported when system_versioning has been turned on
Replication is limited
History tables do not support constraints like primary keys, foreign keys
Insert and Update statements cannot reference the period columns (system_time)
Modification of data is not supported in the historical table
Table designer not available
FileTables are not supported

Conclusion

Is this the implementation I was personally wishing for that new feature? Not exactly.

Does that mean that this new feature is totally useless? Of course not. It can be useful in some situations.

As we have seen with other features like the column store index, Microsoft often gives us some features that seems incomplete for some people but are totally useful for others. To me, it looks like the first version of a very cool feature that will receive a lot of love in the next iterations. Remember, a new Microsoft product/feature is very often good only at the 3rd iteration!

(Print this page)