PostgreSQL supports two types of timestamps, one with and one without time zone. Contrary to what the name suggests,
timestamp with time zone does not store the time zone. The difference between the two types lies in the semantics and is often a source of confusion.
In order to understand the difference between the two timestamp types, one needs have a basic understanding of time zones. At least for me, this helped in making sense of the difference. Thus, I’ll first take a short detour and talk a bit about time and time zones before diving into times in PostgreSQL. Feel free to jump directly to another section.
- The Concept of Time
- Timestamps in PostgreSQL
- Why Two Types Of Timestamps?
- Working With Timestamps
- A Valid Use Case For
timestamp without time zone
The Concept of Time
Time is the indefinite continued progression of existence and events that occur in apparently irreversible succession from the past through the present to the future.
I like to think of time as a continuous timeline representing the absolute notion of time.
What does absolute mean in this context? I’ll try to explain it with an example. When daylight saving changes from summer to winter time it causes the clock to be set back by one hour1. We gain an additional hour on this day. Since we don’t have clocks with 25 hours, a certain hour (usually the third) is repeated, i.e., the clock will show 2:45 twice on this day.
Does that mean we traveled back in time? Of course not! Time steadily progresses along the continuous timeline. The only thing that changed is how the absolute time is mapped to the time we read on a clock, also known as wall time.
We keep track of the absolute time in a special time zone called Coordinated Universal Time, or short UTC. It does not observe daylight saving time and thus can represent the continuous notion of time.
We can denote other time zones by offsets to UTC, such as
-10:30. Further, there are named time zones like
Europe/Berlin. Each such named time zone defines a set of rules denoting which UTC offset applies at any given moment in time. These rules for instance encode daylight saving periods or historical adjustments of the time zone.
The time zone database, maintained by Paul Eggert, is available at IANA (or on GitHub). It contains all time zones with their rules. I encourage you to check it out. You might find some interesting historical facts about your local time zone in the comments, like this one:
One further detail for Switzerland, which is probably out of scope for most users of tzdata: The Europe/Zurich zone describes all of Switzerland correctly, with the exception of the Canton de Genève (Geneva, Genf). Between 1848 and 1894 Geneva did not follow Bern Mean Time but kept its own local mean time. To represent this, an extra zone would be needed.
To summarize, we can think of time zones as functions that uniquely map absolute time to local time, i.e. the wall time.
The same goes the other way around. Knowing the time zone, we can convert any wall time to an absolute time, i.e. UTC. There is one caveat: certain wall times don’t uniquely map to an absolute time, they’re ambiguous. We can see this in the graph above where, for time zone B, certain y-values have multiple x-values. This is the case when transitioning to winter time where a determined hour is repeated.
Timestamps in PostgreSQL
The two timestamp types available in PostgreSQL are
timestamp without time zone and
timestamp with time zone. For brevity’s sake, I’ll be using their shorthand versions
I’ve started using PostgreSQL through an ORM, specifically Active Record, the built-in ORM in Ruby on Rails. Active Record takes care of converting a timestamp from the application’s time zone to UTC before storing it in the DB. It is generally considered a good practice to store timestamps in UTC, otherwise you’ll run into problems related to daylight saving changes. Active Record uses
timestamp columns to store timestamps.
Only having known
timestamp, I was excited when I first encountered
timestamp with time zone, or
timestamptz. I thought to myself: cool, I can store a timestamp along with its time zone. So I started playing in the console:
db=# SELECT '2016-01-01 00:00+10'::timestamptz; timestamptz ------------------------ 2015-12-31 15:00:00+01 (1 row)
Hmm, that was not what I expected. Why didn’t it return
2016-01-01 00:00+10 as I had input? Clearly, the time zone
+10 is not being stored. Instead, it is converting it to my computer’s local time zone
So I started reading the docs and learning more about time zones. Specifically, the PostgreSQL documentation on time zones says:
All timezone-aware dates and times are stored internally in UTC.
The PostgreSQL source confirms this that
timestamptz is stored as a primitive value, seemingly indistinguishable from
#ifdef HAVE_INT64_TIMESTAMP typedef int64 Timestamp; typedef int64 TimestampTz; #else typedef double Timestamp; typedef double TimestampTz; #endif
With a better understanding of time zones, I realized that there is actually no point in storing a timestamp’s time zone. After all, the timestamp denotes an absolute point in time and can thus be converted from UTC back to any time zone for display purposes. That’s exactly what PostgreSQL does:
[Timezone-aware dates and times] are converted to local time in the zone specified by the TimeZone configuration parameter before being displayed to the client.
This explains why in the example before the
timestamptz was rendered not in UTC, but in the time zone set by the
db=# SHOW timezone; TimeZone --------------- Europe/Zurich (1 row) db=# SELECT '2016-01-01 00:00+10'::timestamptz; timestamptz ------------------------ 2015-12-31 15:00:00+01 (1 row)
Why Two Types Of Timestamps?
You might be asking yourself, what is the point of having an additional timestamp type if it is stored the same way?
The answer is semantics. Whenever you encounter a
timestamptz, you know it denotes absolute time. It was stored in UTC. However, when you come across
timestamp, you can’t possibly know in what time zone the timestamp is just by looking at it. It is merely a wall time. Maybe it’s UTC, or maybe the developer stored it in local time. Without additional context, you can’t be sure.
Hence, we should always be using
timestamptz whenever we denote an absolute point in time.
Working With Timestamps
In this section I’ll show various things that can be done with timestamps in PostgreSQL. We’ll assume a default timezone configuration of
UTC in the examples.
Changing The Timezone Configuration
Whenever I connect to a remote database and look at timestamps, I have to mentally translate them from UTC to my local time zone in order to find out how long ago something happened. I’m based in Zurich, which has a UTC offset of one or two hours, depending on the time of year, further complicating things.
As shown earlier,
timestamptz values are shown in the timezone defined by the
timezone configuration. This configuration can be set on a per session basis. This means that I can simply run
SET timezone TO 'Europe/Zurich';
timestamptz values will be printed in my time zone. I don’t have to do the mental mapping anymore.
db=# SELECT now(); now ------------------------------- 2016-05-29 19:20:38.505126+00 (1 row) db=# SET timezone TO 'Europe/Zurich'; SET db=# SELECT now(); now ------------------------------- 2016-05-29 21:20:56.619098+02 (1 row)
timezone configuration has another effect. When parsing a timestamp that has no time zone designator (e.g.
±hhmm), it will be assumed to be local to the currently set
db=# SET timezone TO 'US/Pacific'; SET db=# SELECT '2016-01-01 00:00'::timestamptz; timestamptz ------------------------ 2016-01-01 00:00:00-08 (1 row) db=# SELECT '2016-01-01 00:00Z'::timestamptz; timestamptz ------------------------ 2015-12-31 16:00:00-08 (1 row)
The first example had no designator, thus the timestamp is parsed as
US/Pacific, internally converted to
UTC, and then converted back to
US/Pacific for display. The second example had an explicit time zone, namely
Z denoting UTC.
On the other hand, when casting a string to a
timestamp, any time zone information present is ignored:
db=# SELECT '2016-01-01 00:00-10'::timestamp; timestamp --------------------- 2016-01-01 00:00:00 (1 row) db=# SELECT '2016-01-01 00:00+04'::timestamp; timestamp --------------------- 2016-01-01 00:00:00 (1 row)
Converting Between Timezones
There are two ways of converting a timestamp to another time zone. One is using an expression, the other a function. The function
timezone(zone, timestamp) is equivalent to the SQL-conforming construct
timestamp AT TIME ZONE zone.
Let’s convert the timestamp
2016-01-01 00:00 to
db=# SELECT timezone('US/Pacific', '2016-01-01 00:00'); timezone --------------------- 2015-12-31 16:00:00 (1 row)
We get the wall time in California for
2016-01-01 00:00 UTC. Note that we passed the timestamp as a string, which was implicitly cast to a
timestamptz. Since no explicit time zone information was contained in it, it was parsed in the time zone set by the
timezone config (
UTC in this case). Generally it is better to be explicit by specifying a time zone offset and by manually type casting, i.e., by writing
What happens when we instead cast the string to a
db=# SELECT timezone('US/Pacific', '2016-01-01 00:00'::timestamp); timezone ------------------------ 2016-01-01 08:00:00+00 (1 row)
Rather than subtracting 8 hours, as it did in the previous example, it surprisingly added 8 hours. Let’s look a bit closer at the output to understand what is going on.
In the first example, where we passed a
timestamptz to the conversion function, the return value was
2015-12-31 16:00:00. In the second example, we passed a
timestamp to the function, which gave us
2016-01-01 08:00:00+00. Can you spot the difference? The first output has no time zone information while the latter does.
This means that when we converted a
timestamptz we got back a
timestamp, whereas when we converted a
timestamp, it gave us a
timestamptz. In other words, in the first example we converted an absolute time to a wall time whereas in the second example we did the inverse operation, i.e. we converted a wall time back to an absolute time.
In reality, there are multiple
timezone functions, one that takes
timestamp and one that takes
timestamptz2. I like to think of them as follows:
When passing a time zone
xand a value
timezone(x, t), I’m basically saying, give me the time the clock was showing in time zone
xat the absolute point in time
When passing a time zone
xand a value
timestamp, I’m saying, give me the absolute time for when clocks in time zone
Thanks to the two distinct timestamp types, one denoting absolute and the other wall time, we don’t have to specify from which to which time zone we want to convert, as is the case e.g. in MySQL’s
mysql> SELECT CONVERT_TZ('2004-01-01 12:00:00','GMT','MET'); -> '2004-01-01 13:00:00'
Time Zone Aware Arithmetics
A day doesn’t always have 24 hours. As noted earlier, when daylight saving changes occur, the day can have 23 or 25 hours. What if you need to get all hours for any given day in a certain time zone, e.g. to calculate hourly statistics? This is where UTC or
timestamptz helps us.
We can generate all hours between two
timestamp values (or
timestamptz) as follows:
SELECT generate_series( '2016-06-01 00:00'::timestamp, '2016-06-01 23:00'::timestamp, '1 hour'::interval );
This returns us 24 timestamps. What about days in which daylight saving changes occur? If we don’t take the time zone into account, PostgreSQL will always return 24 hours.
Let’s take as an example the US Pacific time zone. In 2016, summer time started on March 13 and winter time is going to start on November 6. In other words, March 13 had 23 hours while November 6 will have 25 hours. To get the correct number of hours, we first convert the wall times to absolute times and then generate the series. The above query thus becomes:
SELECT generate_series( timezone('US/Pacific', '2016-06-01 00:00'::timestamp), timezone('US/Pacific', '2016-06-01 23:00'::timestamp), '1 hour'::interval );
Now, for June 1 this still returns 24 hours as expected. But what about March 13 and November 6?
db=# SELECT generate_series( timezone('US/Pacific', '2016-03-13 00:00'::timestamp), timezone('US/Pacific', '2016-03-13 23:00'::timestamp), '1 hour'::interval ); generate_series ------------------------ 2016-03-13 08:00:00+00 2016-03-13 09:00:00+00 2016-03-13 10:00:00+00 2016-03-13 11:00:00+00 2016-03-13 12:00:00+00 2016-03-13 13:00:00+00 2016-03-13 14:00:00+00 2016-03-13 15:00:00+00 2016-03-13 16:00:00+00 2016-03-13 17:00:00+00 2016-03-13 18:00:00+00 2016-03-13 19:00:00+00 2016-03-13 20:00:00+00 2016-03-13 21:00:00+00 2016-03-13 22:00:00+00 2016-03-13 23:00:00+00 2016-03-14 00:00:00+00 2016-03-14 01:00:00+00 2016-03-14 02:00:00+00 2016-03-14 03:00:00+00 2016-03-14 04:00:00+00 2016-03-14 05:00:00+00 2016-03-14 06:00:00+00 (23 rows) db=# SELECT generate_series( timezone('US/Pacific', '2016-11-06 00:00'::timestamp), timezone('US/Pacific', '2016-11-06 23:00'::timestamp), '1 hour'::interval ); generate_series ------------------------ 2016-11-06 07:00:00+00 2016-11-06 08:00:00+00 2016-11-06 09:00:00+00 2016-11-06 10:00:00+00 2016-11-06 11:00:00+00 2016-11-06 12:00:00+00 2016-11-06 13:00:00+00 2016-11-06 14:00:00+00 2016-11-06 15:00:00+00 2016-11-06 16:00:00+00 2016-11-06 17:00:00+00 2016-11-06 18:00:00+00 2016-11-06 19:00:00+00 2016-11-06 20:00:00+00 2016-11-06 21:00:00+00 2016-11-06 22:00:00+00 2016-11-06 23:00:00+00 2016-11-07 00:00:00+00 2016-11-07 01:00:00+00 2016-11-07 02:00:00+00 2016-11-07 03:00:00+00 2016-11-07 04:00:00+00 2016-11-07 05:00:00+00 2016-11-07 06:00:00+00 2016-11-07 07:00:00+00 (25 rows)
It returned 23 distinct hours for March 13 and 25 hours for November 6.
To drive the point home, let’s repeat that with the
timezone config set to
US/Pacific. With this config, we can make use of the fact that a string being type cast to
timestamptz will be interpreted in the current time zone set by the
timezone config if the string lacks a time zone designator. Thus, we can skip the explicit wall time to absolute time conversion:
db=# SET timezone TO 'US/Pacific'; SET db=# SELECT generate_series( '2016-03-13 00:00'::timestamptz, '2016-03-13 23:00'::timestamptz, '1 hour'::interval ); generate_series ------------------------ 2016-03-13 00:00:00-08 2016-03-13 01:00:00-08 2016-03-13 03:00:00-07 2016-03-13 04:00:00-07 2016-03-13 05:00:00-07 2016-03-13 06:00:00-07 2016-03-13 07:00:00-07 2016-03-13 08:00:00-07 2016-03-13 09:00:00-07 2016-03-13 10:00:00-07 2016-03-13 11:00:00-07 2016-03-13 12:00:00-07 2016-03-13 13:00:00-07 2016-03-13 14:00:00-07 2016-03-13 15:00:00-07 2016-03-13 16:00:00-07 2016-03-13 17:00:00-07 2016-03-13 18:00:00-07 2016-03-13 19:00:00-07 2016-03-13 20:00:00-07 2016-03-13 21:00:00-07 2016-03-13 22:00:00-07 2016-03-13 23:00:00-07 (23 rows) db=# SELECT generate_series( '2016-11-06 00:00'::timestamptz, '2016-11-06 23:00'::timestamptz, '1 hour'::interval ); generate_series ------------------------ 2016-11-06 00:00:00-07 2016-11-06 01:00:00-07 2016-11-06 01:00:00-08 2016-11-06 02:00:00-08 2016-11-06 03:00:00-08 2016-11-06 04:00:00-08 2016-11-06 05:00:00-08 2016-11-06 06:00:00-08 2016-11-06 07:00:00-08 2016-11-06 08:00:00-08 2016-11-06 09:00:00-08 2016-11-06 10:00:00-08 2016-11-06 11:00:00-08 2016-11-06 12:00:00-08 2016-11-06 13:00:00-08 2016-11-06 14:00:00-08 2016-11-06 15:00:00-08 2016-11-06 16:00:00-08 2016-11-06 17:00:00-08 2016-11-06 18:00:00-08 2016-11-06 19:00:00-08 2016-11-06 20:00:00-08 2016-11-06 21:00:00-08 2016-11-06 22:00:00-08 2016-11-06 23:00:00-08 (25 rows)
With the hours printed in local time, it is easy to see at which point the daylight saving changes occurred.
A Valid Use Case For
timestamp without time zone
The only time we should be using
timestamptz is when the timestamp in question doesn’t denote an absolute point in time. I could only come up with one such scenario.
Imagine a system with various customers where each customer is located in a different time zone. We might want to enqueue a task to run at a certain time, e.g. at midnight, local to each customer. In such a situation, the timestamp (e.g.
2017-01-01 00:00) would not denote a single point in time, but potentially a different one for each customer. With a customer’s time zone we are able to convert that
timestamp to an absolute point in time for each customer given their respective time zone.
Here’s a simple example showcasing this:
db=# WITH customers(id, timezone) AS (VALUES (1, 'Australia/Sydney'), (2, 'Europe/Madrid'), (3, 'America/Los_Angeles')), scripts(name, trigger_at) AS (VALUES ('calculate_yearly_revenue', '2017-01-01'::timestamp), ('calculate_easter_revenue', '2016-03-28'::timestamp)) SELECT customers.id AS customer_id, scripts.name AS script_name, timezone(customers.timezone, scripts.trigger_at) AS trigger_at FROM customers, scripts; customer_id | script_name | trigger_at -------------+--------------------------+------------------------ 1 | calculate_yearly_revenue | 2016-12-31 13:00:00+00 2 | calculate_yearly_revenue | 2016-12-31 23:00:00+00 3 | calculate_yearly_revenue | 2017-01-01 08:00:00+00 1 | calculate_easter_revenue | 2016-03-27 13:00:00+00 2 | calculate_easter_revenue | 2016-03-27 22:00:00+00 3 | calculate_easter_revenue | 2016-03-28 07:00:00+00 (6 rows)
Notice how each
scripts.trigger_at of type
timestamp resulted in a different
timestamptz, namely one absolute point in time for each distinct time zone. Further, notice for Madrid and Los Angeles how midnight at Easter and midnight at New Year’s Day don’t result in the same UTC hour due to daylight saving.
Timestamps in PostgreSQL can be confusing. But once you understand them, working with timestamps and time zones in PostgreSQL becomes a joy. I hope that this article convinced you of that.
If you’re interested in further material regarding time and time zones in general and in PostgreSQL, I encourage you to check out these links:
- Wikipedia: ISO 8601
- PostgreSQL Documentation: Date/Time Types
- PostgreSQL Documentation: Date/Time Functions and Operators
- Hopes&Fears: The time zone rebels of the world
- Noah Sussman: Falsehoods programmers believe about time
- Noah Sussman: More falsehoods programmers believe about time
You might also like my other posts on PostgreSQL: