T-SQL Tuesday #81 - Going back to the classics. Snodgrass’ book on temporal databases
There is a really awesome book about SQL Server you heard about and you decided to buy it. Plan to sit down and read the book. Take a nugget or two from the book and tell us how you can use that nugget of information within your personal or professional environment.
I have chosen to blog about Richard T. Snodgrass’ book “Developing time-oriented database applications in SQL.” I heard about this book last year around this time, when I started to investigate the new temporal feature “System versioned tables” in SQL Server 2016. I believe it was my old colleague Peter Gram from Miracle who pointed out the book to me, and usually when Peter recommends a book, I buy it and (eventually) read it. It was also about time (no pun intended), since I’m giving a talk on “All things time-related” for two SQL Saturdays during the next few months, and I needed to spice up the presentation with some new material.
In this blog post, I will quickly scratch down a few of the takeaways, I have taken from the book already.
First of all, Snodgrass introduces terminology for things in the world of temporal databases. A temporal database is a database in which we want to track the state of instances over time. A database schema is not necessarily temporal just because it has one or more tables with attributes of type DATE or DATETIME. The value of a birthday attribute for instance, doesn’t really change over time.
A table can track the state of an instance in either valid-time, transaction-time or a combination of the two, called bi-temporal. And that is the topic of my T-SQL Tuesday post this month…
Valid-time state tables
In a valid-time state table, we as users chose the period, where values of the attributes of an instance is valid. Let’s say that we have a table
with people, and a another table to track where people live:
PERSON_ADDRESS(PERSON_ID, ADDRESS1, ADDRESS2, ZIPCODE, CITY)
We can now add two DATE or DATETIME attributes VALID_FROM and VALID_TO to the table
PERSON_ADDRESS(PERSON_ID, ADDRESS1, ADDRESS2, ZIPCODE, CITY, VALID_FROM, VALID_TO)
to be able track moving patterns.
When the temporal semantics of the table is valid-state, we as database developers control (and have responsibility) of the values of the VALID_FROM and VALID_TO attributes. We are allowed to insert dates in the future, indicating that we already know that a person is moving next week. And we are allowed to insert data from the past, if we get to know that a person lived somewhere last year. We must ensure that there are no holes in the history set (my name for the set of rows sharing the same value of the primary key of the table):
Inserting a row in the history set will involve updating the value(s) of VALID_TO attribute(s) of other rows in the history set. I have described this in more detail in this post on data warehouse history (type 2) here:
Transaction-time state tables
Transaction-time state tables are what is implemented in “System-versioned tables” in SQL Server 2016. Here we also have two attributes START_DATE and STOP_DATE (names can be changed, of course) indicating the temporal validity of a row from the history set. But now the RDBMS controls the values of these attributes. The value of the START_DATE is set using transaction time, and the RDBMS takes care of the book keeping of aligning START_DATE and STOP_DATE. Since history attribute values are controlled using transaction time, we can no longer make temporal updates in the past or the future. This is also why I don’t think that the system-versioned tables feature is suitable for data warehouse T2 history (see link to other blog post below).
Bi-temporal state tables
Combining the two approaches above, we get bi-temporal state tables. Here we have to pairs of temporal attributes: (VALID_FROM, VALID_TO) and (START_DATE, STOP_DATE) on our person address table:
PERSON_ADDRESS(PERSON_ID, ADDRESS1, ADDRESS2, ZIPCODE, CITY, VALID_FROM, VALID_TO, START_TIME, END_TIME)
In a bi-temporal state table schema, we can now express things like “what was the history set for Bill on August the 1st, 2016?” And in applications, where you need to be able to redo a calculation, you did in the past, bi-temporal state tables are a pretty nice thing to use.
That’s it for this months T-SQL Tuesday. Thanks for reading along… Later on during the next months, I will follow up with some more detailed posts on this exciting topic.
Download your own free copy of Snodgrass’ book here:
Why I don’t think that the system-versioned tables feature is suitable for data warehouse T2 history:
Read more about T-SQL Tuesday #081 here: