Wanting an hstore-style delete operator in jsonb

PostgreSQL 9.4 introduced the jsonb type, but it’d be nice to be able to delete keys and pairs using the “-” operator, just like you can with the hstore type.
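For reference, this is the behaviour we get from hstore (a quick sketch, assuming the hstore extension is installed):

[postgres]
TEST=# SELECT 'a=>1, b=>2, c=>3'::hstore - 'b'::text;
      ?column?
--------------------
 "a"=>"1", "c"=>"3"
(1 row)
[/postgres]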

Fortunately postgres makes creating an operator really easy for us, so let’s have a go at creating a delete operator for jsonb.

First let’s try to create an operator just to delete one key passed as text. We need to start by creating a function for our operator, and the only way I can think of to do this, looking at the docs, is to unwrap the json with jsonb_each, filter out the matches, and roll it all back up:

[postgres]
TEST=# CREATE OR REPLACE FUNCTION jsonb_delete_left(a jsonb, b text)
RETURNS jsonb AS
$BODY$
SELECT COALESCE(
  (
    SELECT ('{' || string_agg(to_json(key) || ':' || value, ',') || '}')
    FROM jsonb_each(a)
    WHERE key <> b
  )
, '{}')::jsonb;
$BODY$
LANGUAGE sql IMMUTABLE STRICT;
CREATE FUNCTION

TEST=# COMMENT ON FUNCTION jsonb_delete_left(jsonb, text) IS 'delete key in second argument from first argument';
COMMENT
[/postgres]

Once we’ve created our function, we just need to create the operator to use it:

[postgres]
TEST=# CREATE OPERATOR - ( PROCEDURE = jsonb_delete_left, LEFTARG = jsonb, RIGHTARG = text);
CREATE OPERATOR
TEST=# COMMENT ON OPERATOR - (jsonb, text) IS 'delete key from left operand';
COMMENT
[/postgres]

And we’re ready to go:

[postgres]
TEST=# SELECT '{"a": 1, "b": 2, "c": 3}'::jsonb - 'b'::text;
     ?column?
------------------
 {"a": 1, "c": 3}
(1 row)
[/postgres]

Seems to work. Let’s now try to create one that will let us delete a set of keys passed as an array:

[postgres]
TEST=# CREATE OR REPLACE FUNCTION jsonb_delete_left(a jsonb, b text[])
RETURNS jsonb AS
$BODY$
SELECT COALESCE(
  (
    SELECT ('{' || string_agg(to_json(key) || ':' || value, ',') || '}')
    FROM jsonb_each(a)
    WHERE key <> ALL(b)
  )
, '{}')::jsonb;
$BODY$
LANGUAGE sql IMMUTABLE STRICT;
CREATE FUNCTION

TEST=# COMMENT ON FUNCTION jsonb_delete_left(jsonb, text[]) IS 'delete keys in second argument from first argument';
COMMENT

TEST=# CREATE OPERATOR - ( PROCEDURE = jsonb_delete_left, LEFTARG = jsonb, RIGHTARG = text[]);
CREATE OPERATOR

TEST=# COMMENT ON OPERATOR - (jsonb, text[]) IS 'delete keys from left operand';
COMMENT

TEST=# SELECT '{"a": 1, "b": 2, "c": 3}'::jsonb - ARRAY['a','b'];
 ?column?
----------
 {"c": 3}
(1 row)
[/postgres]

OK, so now let’s create one to delete matching key/value pairs:

[postgres]
TEST=# CREATE OR REPLACE FUNCTION jsonb_delete_left(a jsonb, b jsonb)
RETURNS jsonb AS
$BODY$
SELECT COALESCE(
  (
    SELECT ('{' || string_agg(to_json(key) || ':' || value, ',') || '}')
    FROM jsonb_each(a)
    WHERE NOT ('{' || to_json(key) || ':' || value || '}')::jsonb <@ b
  )
, '{}')::jsonb;
$BODY$
LANGUAGE sql IMMUTABLE STRICT;
CREATE FUNCTION

TEST=# COMMENT ON FUNCTION jsonb_delete_left(jsonb, jsonb) IS 'delete matching pairs in second argument from first argument';
COMMENT

TEST=# CREATE OPERATOR - ( PROCEDURE = jsonb_delete_left, LEFTARG = jsonb, RIGHTARG = jsonb);
CREATE OPERATOR

TEST=# COMMENT ON OPERATOR - (jsonb, jsonb) IS 'delete matching pairs from left operand';
COMMENT

TEST=# SELECT '{"a": 1, "b": 2, "c": 3}'::jsonb - '{"a": 4, "b": 2}'::jsonb;
     ?column?
------------------
 {"a": 1, "c": 3}
(1 row)
[/postgres]

Seems to work fine to me; let’s try an index:

[postgres]
TEST=# \timing on
Timing is on
TEST=# CREATE TABLE jsonb_test (a jsonb, b jsonb);
CREATE TABLE
Time: 207.038 ms

TEST=# INSERT INTO jsonb_test VALUES ('{"a": 1, "b": 2, "c": 3}', '{"a": 4, "b": 2}');
INSERT 0 1
Time: 39.979 ms

TEST=# SELECT * FROM jsonb_test WHERE a-b = '{"a": 1, "c": 3}'::jsonb;
             a             |        b
---------------------------+------------------
 {"a": 1, "b": 2, "c": 3}  | {"a": 4, "b": 2}
(1 row)

Time: 47.197 ms

TEST=# INSERT INTO jsonb_test
SELECT ('{"a" : ' || i+1 || ',"b" : ' || i+2 || ',"c": ' || i+3 || '}')::jsonb,
       ('{"a" : ' || i+2 || ',"b" : ' || i || ',"c": ' || i+5 || '}')::jsonb
FROM generate_series(1,1000) i;
INSERT 0 1000
Time: 84.765 ms

TEST=# CREATE INDEX ON jsonb_test USING gin((a-b));
CREATE INDEX
Time: 229.050 ms
TEST=# EXPLAIN SELECT * FROM jsonb_test WHERE a-b @> '{"a": 1, "c": 3}';
                                    QUERY PLAN
------------------------------------------------------------------------------------
 Bitmap Heap Scan on jsonb_test  (cost=20.26..24.52 rows=1 width=113)
   Recheck Cond: ((a - b) @> '{"a": 1, "c": 3}'::jsonb)
   ->  Bitmap Index Scan on jsonb_test_expr_idx  (cost=0.00..20.26 rows=1 width=0)
         Index Cond: ((a - b) @> '{"a": 1, "c": 3}'::jsonb)
(4 rows)

Time: 13.277 ms
[/postgres]

All seems to work as expected. I guess the one thing I’m not so certain about here is whether any of this behaves correctly once we start getting nested json, but at first glance it doesn’t look too wonky to me:

[postgres]
TEST=# SELECT '{"a": 1, "b": 2, "c": 3, "d": {"a": 4}}'::jsonb - '{"d": {"a": 4}, "b": 2}'::jsonb;
     ?column?
------------------
 {"a": 1, "c": 3}
(1 row)

TEST=# SELECT '{"a": 4, "b": 2, "c": 3, "d": {"a": 4}}'::jsonb - '{"a": 4, "b": 2}'::jsonb;
        ?column?
-------------------------
 {"c": 3, "d": {"a": 4}}
(1 row)
[/postgres]
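One nested case worth noting: the pair delete compares whole values at the top level only, so a partial match on a nested object deletes nothing. A quick sketch using the operator we just defined; since {"a": 4, "b": 5} is not contained in {"a": 4}, the pair is kept:

[postgres]
TEST=# SELECT '{"a": 1, "d": {"a": 4, "b": 5}}'::jsonb - '{"d": {"a": 4}}'::jsonb;
            ?column?
---------------------------------
 {"a": 1, "d": {"a": 4, "b": 5}}
(1 row)
[/postgres]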

Of course, being written in SQL these probably aren’t anywhere near as fast as the hstore equivalents, which are written in C, so it’d be nice to see something in core postgres to do this.

Logging information on long running transactions in PostgreSQL

PostgreSQL has some runtime parameters we can use to monitor long running queries; we can set log_min_duration_statement to log any statements that take longer than the value in milliseconds we set. But what can we do if we have an application that keeps a long running transaction open but sits mainly ‘<IDLE> in transaction’? Perhaps it’s pinning that transaction open whilst it performs some long running action elsewhere? We can use a monitoring tool like Nagios to show long running transactions, but the activity they actually do won’t be logged by postgres unless they’re blocking somewhere. What comes to mind here is distributed transaction managers playing up.

For example, if we set log_min_duration_statement to 0, postgres will log every query we issue, and we’ll see a message like this in the log:

[postgres]
LOG: duration: 3.475 ms statement: SELECT * FROM pg_stat_database where datname='TEST'
[/postgres]
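For a quick test, log_min_duration_statement can also be set for just the current session (changing it requires superuser); a minimal sketch:

[postgres]
TEST=# SET log_min_duration_statement = 0;
SET
[/postgres]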

We can set statement_timeout to forcefully prevent applications from running queries that take longer than we would like; any statement that exceeds statement_timeout in milliseconds will be cancelled. For example:

[postgres]
TEST=# set statement_timeout = '2ms';
SET
TEST=# SELECT * FROM pg_stat_database where datname='TEST';
ERROR: canceling statement due to statement timeout
[/postgres]

Obviously setting such a low value is not much use, but a typical use case would be a busy web application where a user is likely to have gone away after 10 seconds anyway, in which case we could alter the application’s user to have a statement_timeout somewhere around 10 seconds.
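A minimal sketch of that, assuming the application connects as a role named webapp (the role name here is hypothetical); the setting takes effect for the role’s new sessions:

[postgres]
TEST=# ALTER ROLE webapp SET statement_timeout = '10s';
ALTER ROLE
[/postgres]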

To log blocking queries we can set log_lock_waits to on, in which case postgres will log details of any query still waiting on a lock when it checks for deadlock after the deadlock_timeout interval (default 1 second). We’ll see a message something like this in the log, depending on the type of lock:

[postgres]
LOG: process 16532 still waiting for ShareLock on transaction 311823400 after 1000.147 ms
STATEMENT: UPDATE balls SET description = 'TEST' WHERE id = 5;
[/postgres]

The above tells us that after 1 second our statement was still waiting on a lock. This is not to be confused with deadlock; the deadlock_timeout parameter factors in here only because postgres may as well output this information whilst it’s doing the work to check for deadlock anyway. Actual deadlock would look something like this:

[postgres]
LOG: process 4465 detected deadlock while waiting for ShareLock on transaction 311824216 after 1000.158 ms
STATEMENT: UPDATE balls SET description = 'TEST' WHERE id = 5;
ERROR: deadlock detected
DETAIL: Process 4465 waits for ShareLock on transaction 311824216; blocked by process 16532.
Process 16532 waits for ShareLock on transaction 311824207; blocked by process 4465.
Process 4465: UPDATE balls SET name = 'TEST' WHERE id = 5;
Process 16532: UPDATE balls SET name = 'TEST2' WHERE id = 6;
[/postgres]
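Both log_lock_waits and deadlock_timeout can be changed with a config reload rather than a restart. A minimal sketch for enabling the lock wait logging cluster-wide (ALTER SYSTEM needs 9.4+ and superuser; on older versions edit postgresql.conf and reload):

[postgres]
TEST=# ALTER SYSTEM SET log_lock_waits = on;
ALTER SYSTEM
TEST=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)
[/postgres]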

Whilst all of the above is useful for tracking down problems, the one thing we can’t do with runtime parameters is time out or log long running transactions, and perhaps even log the statements that may only persist for short periods within the transaction. In most cases we’d be using some sort of monitoring platform like nagios, cacti or munin to monitor transaction times along with everything else (including the query times and locks mentioned above), but sometimes it’d be nice to see that information in the postgres logs too.

Generally if some application is keeping long running or idle transactions open, then it’s the application that needs to be fixed. But when that’s out of your hands you can feel a bit stuck whilst you wait for that to happen, so I’ve written a pl/pgsql function that can be scheduled to log some information, and even cancel/terminate sessions. The function can be downloaded here or here

For example if we want to log information on long running transactions, we could run the following:

[postgres]
TEST=# SELECT public.pg_log_long_xact('1s');
NOTICE: long_xact pid: 4465 duration: 431.325747 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
NOTICE: long_xact pid: 16532 duration: 327.438302 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
 pg_log_long_xact
------------------
(0 rows)
[/postgres]

If we want to see if either of these is waiting on the other:

[postgres]
TEST=# SELECT public.pg_log_long_xact('1s', true);
NOTICE: long_xact pid: 4465 duration: 471.885373 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
NOTICE: long_xact pid: 16532 duration: 367.997928 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
NOTICE: long_xact waiter pid: 16532 blocker detail is; pid: 4465 duration: 471.885373 ms relation: any (public.balls (RowExclusiveLock)) lock type: transaction id 311824482 user: glyn application: psql client: [local] statement: <IDLE> in transaction
 pg_log_long_xact
------------------
(0 rows)
[/postgres]

We can set the RAISE level it uses (in hindsight the default should probably have been LOG), or have it output the details as text:

[postgres]
TEST=# \t on
Showing only tuples.
TEST=# SELECT public.pg_log_long_xact('1s', true, 'text');
 long_xact pid: 4465 duration: 574.477076 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
 long_xact pid: 16532 duration: 470.589631 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
 long_xact waiter pid: 16532 blocker detail is; pid: 4465 duration: 574.477076 ms relation: any (public.balls (RowExclusiveLock)) lock type: transaction id 311824482 user: glyn application: psql client: [local] statement: <IDLE> in transaction
[/postgres]

If we’re really having issues, we can make it start cancelling statements on each run, starting with those that have been blocking the longest, otherwise those that have been running for the longest duration. As of pg 9.3 there’s now a lock_timeout parameter that will abort any statement waiting on a lock for longer than the specified number of milliseconds, which is much better; but even post 9.3 the above function can be useful, because it will attempt to abort the blocking transaction rather than the waiting statement, or, if there’s no locking involved, just the longest running transaction. Obviously not to be used in haste:

[postgres]
TEST=# SELECT public.pg_log_long_xact('1s', true, 'text', '10 minutes');
 long_xact pid: 4465 duration: 895.57988 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
 long_xact unable to cancel backend with pid: 4465
 long_xact pid: 16532 duration: 791.692435 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
 long_xact waiter pid: 16532 blocker detail is; pid: 4465 duration: 895.57988 ms relation: any (public.balls (RowExclusiveLock)) lock type: transaction id 311824482 user: glyn application: psql client: [local] statement: <IDLE> in transaction
[/postgres]
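For comparison, a quick sketch of lock_timeout itself (9.3+); note it cancels the waiting statement rather than the blocker, and the error text is approximately what the server emits:

[postgres]
TEST=# SET lock_timeout = '2s';
SET
TEST=# UPDATE balls SET description = 'TEST' WHERE id = 5;
ERROR: canceling statement due to lock timeout
[/postgres]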

In the pg_log_long_xact run above, it can’t cancel the blocking statement because the session is idle in transaction, so there’s nothing to cancel; but we can pass another timeout after which it will start terminating backends:

[postgres]
TEST=# SELECT public.pg_log_long_xact('1s', true, 'text', '10 minutes', '15 minutes');
 long_xact pid: 4465 duration: 1026.90736 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
 long_xact terminated backend with pid: 4465
 long_xact pid: 16532 duration: 923.019915 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
[/postgres]

On our next run it has “resolved” the issue; I say “resolved” because it’s removed the blocking session:

[postgres]
TEST=# \t off
Tuples only is off.
TEST=# SELECT public.pg_log_long_xact('1s', true, 'text', '10 minutes', '15 minutes');
 pg_log_long_xact
------------------
(0 rows)
[/postgres]

By default the function only cancels/terminates the longest running blocker, or the longest running transaction. We can be more forceful than that by passing an extra parameter to tell it to cancel statements/terminate backends of all long running transactions it finds (I’ve lowered the timeouts purely because I had to restart the transactions/statements):

[postgres]
TEST=# \t off
TEST=# SELECT public.pg_log_long_xact('1s', true, 'text', '2 minutes', '3 minutes', true);
 long_xact pid: 19065 duration: 187.279089 ms user: glyn application: psql client: [local] statement: <IDLE> in transaction
 long_xact terminated backend with pid: 19065
 long_xact pid: 16532 duration: 184.251569 ms user: glyn application: psql client: [local] statement: UPDATE balls SET description = 'TEST' WHERE id = 5;
 long_xact cancelled backend with pid: 16532
(4 rows)
[/postgres]

What would be nice to see, in addition to the lock_timeout parameter, would be a blocker_timeout parameter that functions just like lock_timeout but aborts the blocking session instead. Obviously such a parameter could cause more havoc if the offending application just keeps retrying the blocking operation, or worse, you have it set and it aborts the restructure you’ve been waiting hours for. But I think in some instances it could be useful; set to a large enough value, we could kill the offending blocking transaction and let all the others complete.

Update 30/09/2016 – Postgres 9.6 adds idle_in_transaction_session_timeout, which allows us to automatically terminate any session whose transaction sits idle for longer than the value in milliseconds we set.
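A quick sketch of that parameter in action; after sitting idle inside an open transaction past the timeout, the next command finds the server has terminated the connection (the FATAL text is approximately what the server reports):

[postgres]
TEST=# SET idle_in_transaction_session_timeout = '5s';
SET
TEST=# BEGIN;
BEGIN
TEST=# -- wait more than 5 seconds, then:
TEST=# SELECT 1;
FATAL: terminating connection due to idle-in-transaction timeout
[/postgres]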

Schema change management with sqitch

Today I’ve been reading about Sqitch, a database change management system designed by David Wheeler, who is also the author of the pgTAP suite that I recently looked at. What’s quite attractive is that Sqitch can be used with pretty much any DBMS and version control system.

The concept appears to be that database changes are scripted both ways: a “deploy” script containing SQL for the changes we want to make, a “revert” script containing the SQL to revert those changes, and a “verify” script containing some SQL to check that the “deploy” script worked.
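As a minimal sketch of the three-script idea (the change name and file contents here are hypothetical, not from the Sqitch docs), a change adding a balls table might consist of:

[postgres]
-- deploy/add_balls_table.sql
CREATE TABLE balls (
    id          serial PRIMARY KEY,
    description text NOT NULL
);

-- revert/add_balls_table.sql
DROP TABLE balls;

-- verify/add_balls_table.sql
-- selecting the columns errors out if the table or columns are missing
SELECT id, description FROM balls WHERE FALSE;
[/postgres]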

Making modifications to a change is essentially done by renaming the existing scripts with a version number, copying the “deploy” script into both the “deploy” and “revert” scripts of the new version, and editing from there. This is all done with the Sqitch “rework” command.

Sqitch also allows you to make one set of changes dependent on other changes, preventing application of changes out of order. All the scripts are then stored in a version control system of your choosing and can be deployed and managed directly through the command line tool sqitch (which has obviously been influenced by git).

It looks like a nice way of managing database changes, but I like to be able to see all scripts side by side, and I’m not sure how I feel about segmenting every set of changes into separate files. I guess for most DDL this is fine, but when it comes to procedural language functions I want them in one place, organised by relevance rather than deployment order. Of course there’s nothing stopping us using Sqitch purely for deployment change management whilst keeping the current development copies of scripts in a separate VCS, perhaps with some “glue” in between to create the “deploy” and “revert” scripts from diffs for us (I’m not sure we could automate the verify scripts).

I need to do a little more thinking on exactly how to make use of it for us, but there’s an excellent tutorial using Git and PostgreSQL here.