As the notion of time is fundamental to our perception of the world, it’s only normal that many real world databases are temporal databases: databases that deal explicitly with the temporal dimension of facts.
In the following location table, row 2 says that Silvio has been (or will be) in Rome “sometime”.
id |
person |
place |
1 |
Silvio |
Milan |
2 |
Silvio |
Rome |
To record the fact that I’ve been in Rome from Jan 10th 2011 to Jan 12th 2011, I need more columns.
id |
person |
place |
valid_from |
valid_to |
1 |
Silvio |
Milan |
1900-01-01 |
2011-01-10 |
2 |
Silvio |
Rome |
2011-01-10 |
2011-01-12 |
3 |
Silvio |
Milan |
2011-01-12 |
9999-01-01 |
On row 2, the new columns valid_from, and valid_to store the period in which the relation (‘Silvio’, ‘Rome’) was true. Now “location” records the past and future history of where and when I’ve been. You see I don’t travel much.
The dates ‘1900-01-01’ and ‘9999-01-01’ stand for negative infinity and positive infinity. We conventionally use only dates greater than ‘1900-01-01’ and less than ‘9999-01-01’ to mean actual dates (in this db).
“Negative infinity” doesn’t have to be the beginning of time. It’s as back in time as the system cares. (The same is true for “positive infinity” of course.)
I could have chosen other dates, it depends on what you are modeling.
I don’t use NULL to represent infinite values because they would complicate the queries on “location”.
Compare the following two versions of “where was Silvio on October 20th 2011?” and keep in mind that it’ll only get worse with “real” queries (… and don’t use * in your “real” production queries).
-- 1. no nulls
select * from location
where
person = 'Silvio' and
valid_from <= '2011-10-20' and '2011-10-20' < valid_to
-- 2. with nulls
select * from location
where
person = 'Silvio' and
(valid_from <= '2011-10-20' or valid_from is null) and
('2011-10-20' < valid_to or valid_to is null)
Also, since I haven’t used NULL to represent infinity, if I want, I can still use NULL for what it’s meant: “information unknown”. (but I wouldn’t).
You have noticed that the value of “valid_to” of one row is always the same as the value of “valid_from” of the next row.
Resist the urge to “normalize” the schema, e. g. removing the “valid_to” column. If you do remove it, say goodbye to nice, simple queries: hello sql inferno!
To summarize the temporal design features of “location” table, we can say that “location” is a valid-time table.
Now, you’ll think that this modeling of valid-time tables is really trivial. I agree about the simplicity. Simple designs are the holy grail of programming.
If you ever maintained an app with a temporal db, you may have observed that surprisingly simple things like valid-time table end up messily implemented.
Also there are some non-trivial implications about introducing the time dimension in a table: think about the identity of your entities and you unique keys. Think about enforcing integrity constraints on a valid-time table… Lots of fun here 🙂
We haven’t even scratched the surface of temporal databases yet. I am convinced that this topic is absolutely crucial for any enterprise (that I can think of) but still doesn’t usually get the attention it deserves.
I end this introductory post with a couple of references.
A few months ago I found an excellent book on the subject of temporal-db.
Developing Time-Oriented Database Applications in SQL
Richard T. Snodgrass
Morgan Kaufmann Publishers, Inc.
San Francisco, July, 1999
504+xxiii pages
ISBN 1-55860-436-7
You can download it from the author’s page tdbbook.pdf
I can only thank the author for his generosity.
I’m currently reading this book, so, if you find anything valuable in my posts about temporal-dbs, you probably should just thank Mr. Snodgrass. Obviously any errors are mine.
While I enjoy writing down my thoughts, I only hope to provoke your curiosity. On any topic, if you’re serious about it, I invite you to take a look at the sources and do some research of your own. (research as in googling, not as in university).
Wikipedia is also a great resource (as ever):
http://en.wikipedia.org/wiki/Valid_time