MySQL versus PostgreSQL: Adding a 'Last Modified Time' Column to a Table
This is the second post here detailing my ongoing adventures with PostgreSQL. This time I had a requirement to add a "timestamp" column to a table. The point of this being to allow us to track the "last modified" time of a row, without requiring that the application code manage the timestamp itself.
There's a lot of reasons why you might wish to do this. In this case it was to simplify syncing the data into a data warehouse. More specifically, to allow the DBA to easily identify rows which have changed since the last import.
Having done this a couple of times in MySQL, I assumed that the process would be straightforward. I should know better by now!
MySQL
MySQL provides a TIMESTAMP column type, for exactly this purpose. It's formatted as a standard DATETIME, but can optionally be configured to automatically set itself to the current time when the row is inserted (on by default) and/or updated (off by default). It's a breeze to add a column of this type to a table, like so:
ALTER TABLE mytable
ADD lastmodified TIMESTAMP
DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP;
Notice the crucial ON UPDATE clause, which tells MySQL to update the timestamp when the row is modified.
All rows in the lastmodified column are initially set to the 'zero' date, i.e. 0000-00-00 00:00:00. If we want to backfill the column with another value we can do that in a further step. This example sets each row's timestamp to the current date and time:
UPDATE mytable
SET lastmodified=CURRENT_TIMESTAMP
Pretty straightforward, all in all.
PostgreSQL
Let's dive right in and view the Postgres solution in all its glory:
ALTER TABLE mytable
ADD lastmodified TIMESTAMP;
ALTER TABLE mytable
ALTER COLUMN lastmodified
SET DEFAULT CURRENT_TIMESTAMP;
UPDATE mytable
SET lastmodified=CURRENT_TIMESTAMP;
CREATE OR REPLACE FUNCTION update_lastmodified_column()
RETURNS TRIGGER AS '
BEGIN
NEW.lastmodified = NOW();
RETURN NEW;
END;
' language 'plpgsql';
CREATE TRIGGER update_lastmodified_modtime BEFORE UPDATE
ON mytable FOR EACH ROW EXECUTE PROCEDURE
update_lastmodified_column();
That's a fair deal more complex. The big obstacle here is that Postgres doesn't have the equivalent of MySQL's TIMESTAMP type. It does a have type named TIMESTAMP, but this is analogous to MySQL's DATETIME. We'll have to reverse engineer the behaviour we want.
We'll step though the process and have a closer look at what's involved. We'll start by creating the column and specifying that inserted rows should automatically default to the current date and time. As we've seen before, PostgreSQL won't allow us to do that in one step.
ALTER TABLE mytable
ADD lastmodified TIMESTAMP;
ALTER TABLE mytable
ALTER COLUMN lastmodified
SET DEFAULT CURRENT_TIMESTAMP;
Again, we may wish to backfill the column:
UPDATE mytable
SET lastmodified = CURRENT_TIMESTAMP;
As an aside, I like that Postgres has some convenient shorthand values for commonly used dates. Alongside CURRENT_TIMESTAMP, there's such options as "yesterday", "epoch" and the cosmologically questionable "-infinity". They're documented here.
So far we've created the column, backfilled it with sane values, and any newly inserted rows will be timestamped appropriately. But what about updates to rows? This is where the fun really begins.
Lacking MySQL's ON UPDATE clause, which we noted earlier, we need to create a TRIGGER. A trigger is a kind of "event handler" which is fired off (well, triggered) in response to some action, in this case UPDATEs. To make things even more interesting, Postgres deviates from the SQL99 standard by not allowing us to run SQL directly within the trigger: any functionality we require must be defined in a stored proc:
CREATE OR REPLACE FUNCTION update_lastmodified_column()
RETURNS TRIGGER AS '
BEGIN
NEW.lastmodified = NOW();
RETURN NEW;
END;
' language 'plpgsql';
NEW is a special keyword which refers to the new version of the row. Similarly, OLD is available to us, should we need to access the previous values within the row.
Finally, we have to attach that proc to the relevant table by use of the trigger:
CREATE TRIGGER update_lastmodified_modtime BEFORE UPDATE
ON mytable FOR EACH ROW EXECUTE PROCEDURE
update_lastmodified_column();
MySQL's single command has succesfully been implemented as a dozen or so lines of Postgres code. I'll give round two to MySQL, but I'll give points to PostgreSQL for making my day a little more interesting than it might otherwise have been!
Related Reading
- PostgreSQL 9.0 High Performance by Gregory Smith
- PostgreSQL 9 Administration Cookbook by S. Riggs and H. Krosing
- High Performance MySQL: Optimization, Backups, Replication, and More
fansipans
Unless I'm missing something, all of the trigger contortions above can be obviated by setting the column's default value to "now()".
One of the nice features of PostgreSQL is that you can have a function call as the default, and it will be evaluated on INSERT.
test=# create table foo (bar integer, baz timestamp default now());
CREATE TABLE
test=# insert into foo (bar) values (12);
INSERT 0 1
test=# select * from foo;
bar | baz
-----+----------------------------
12 | 2009-04-10 09:09:52.942531
(1 row)
fansipans
Oh dur, I didn't notice you needed it on UPDATE - right. Then you'd need a trigger ;)
Kevin F
As an extension to the discussion, mySQL only supports one such timestamp per table. Thus, if your requirements are to track an INSERT date and an UPDATE date column, for instance, then you would also have to use mysql triggers, as in postgres. It's a strange restriction. beware.
westcoast coder
Thanks for this post. It saved my day as I've been scratching to do so since it was so easy to do in MySQL and struggling to do in postgres
Daniele Pignedoli
Great article, i found it very usefull... anyway i hope the postgres team will soon create a automatic function for this case (as the serial aka auto_increment)
Ben Maden
Great article... I was looking for a way to do this across all tables. I'm sure there must be a way.... not found it but did find this...
Installed the SPI module
http://www.postgresql.org/docs/current/static/contrib-spi.html
psql -U postgres db_name < ./contrib/moddatetime.sql
(needed yum install postgres-contrib first)
So any table can now have the following trigger added....
CREATE TRIGGER table_update BEFORE UPDATE
ON table_name FOR EACH ROW EXECUTE PROCEDURE
moddatetime('updated_at');
Similar to your code but you can now pass in the update field name as required... just in case you don't have a convention.
Thanks again for the great post.
cheers,
Ben
Ludwig Kniprath
Thanks a lot,
as extension you can add
NEW.last_update_by = session_user;
to the function above to get the equivalent of mySQLs
ON UPDATE CURRENT_USER;
(assumed the fieldname is last_update_by)
Iceblue
This trigger do not give the same result as mysql timestamp. With this solution its not possible to set the timestamp field itself manually without messing it up. Mysql timestamp allowed that...I really dont see how to achieve this in postgresql
Simon Harris
Thanks, Iceblue. That's interesting if true.
Really though, the whole point of a timestamp column is that you don't set it yourself, but I acknowledge that MySQL allows this, for better or worse.
Chris Angelico
Yes, it's easier in MySQL. But Postgres gives the huge advantage of flexibility. This is something I see in quite a few places; it's easier to set up a dynamic web site using MySQL, but Postgres lets you use its tools in more ways. 99% of situations are covered by MySQL's options, but the other 1% are actually quite fiddly to work with; Postgres covers everything, but forces you to be a bit more verbose to do it.
I would like a more concise trigger notation though. Forcing triggers to be stored procedures is a bit annoying.
alias
With regard to Iceblue's comment about the procedure not allowing a purposely-set value in the timestamp column, the procedure can be rigged up to allow for this:
create or replace function update_timestamp()
returns trigger as $$
begin
if new.lastmodified is null or new.lastmodified = old.lastmodified then
new.lastmodified = now();
end if;
return new;
end;
$$ language 'plpgsql';
Of course, this assumes you're not trying to explicitly set the existing value back into the field, so there is still a minor uncovered corner case to consider.
Daniel
Thanks! This was helpful.