Category Archives: Perl

Schema change management with sqitch

Today I’ve been reading about Sqitch, a database management system designed by David Wheeler; who is also the author of the pgTap suite that I also recently looked at. What’s quite attractive is that Sqitch can be used with pretty much any DBMS and version control system.

The concept appears to be that database changes are scripted both ways; a “deploy” script containing SQL for the changes we want to make, a “revert” script containing the SQL to revert our changes, and also a “verify” script containing some SQL to check the “deploy” scripts work.

Making modifications to a script is essentially done by renaming the existing scripts with a version number, and copying the “deploy” script into both the “deploy” and “revert” scripts of the new version and editing from there. This is all done with the Sqitch “rework” command.

Sqitch also allows you to make one set of changes dependant on other changes, preventing application of changes out of order. All the scripts are then stored in a version control system of your choosing and can be deployed and managed directly through the command line tool sqitch (which has obviously been influenced by git).

It looks like a nice way of managing database changes, but I like to be able to see all scripts side by side and I’m not sure how I feel about segmenting every set of changes into separate files. I guess for most DDL this is fine, but when it comes to procedural language functions I want them in one place organised by relevance not deployment order. Of course there’s nothing stopping us using Sqitch purely for deployment change management whilst keeping the current development copies of scripts in a separate VCS, perhaps with some “glue” in between to create the “deploy” and “revert” scripts from diffs for us (not sure if we could automate the verify scripts).

I need to do a little more thinking on exactly how to make use of it for us, but there’s an excellent tutorial using Git and PostgreSQL here.

Unit testing in PostgreSQL with pgTap

The applications I work with use procedural functions and RDBMS features quite heavily, and although we have unit tests for application code, we currently have no unit testing directly in the database. I’ll be the first to admit that I can be fairly highly strung when making large changes to our systems, and having a suite of unit tests to run is a great way to extinguish any histrionics.

I’ve been looking at pgTap today which provides an excellent way to perform unit testing in PostgreSQL covering pretty much every aspect of the schema.

To install pgTap we just download, make and install just like any other extension:

$ cd /usr/pgsql_src
$ wget http://api.pgxn.org/dist/pgtap/0.94.0/pgtap-0.94.0.zip
$ unzip pgtap-0.94.0.zip && cd pgtap*
$ make
$ make installcheck PGUSER=glyn
$ sudo make install

The system I’m testing on is still running Postgres 9.0, so to install the extension I have to revert to running the sql script rather than “CREATE EXTENSION”, and for some reason the pgtap.sql script didn’t end up in my contrib directory along with the sql for the extension based versions, so I just copied it there manually before running it:

$ cp sql/pgtap.sql `pg_config --sharedir`/contrib
$ psql -d TEST -f `pg_config --sharedir`/contrib/pgtap.sql 

If this were 9.1+ that would just be just be:

$ psql -d TEST -c "CREATE EXTENSION pgtap"

Now pgTap is installed I can run a test that checks nothing. We perform all tests within a transaction which is rolled back at the end to ensure our tests do not make any persistent changes.

[postgres highlight=”1,3,8,13,17″]
TEST=# BEGIN;
BEGIN
TEST=# SELECT plan(1);
plan
——
1..1
(1 row)
TEST=# SELECT pass( ‘Testing nothing’ );
pass
————————
ok 1 – Testing nothing
(1 row)
TEST=# SELECT * FROM finish();
finish
——–
(0 rows)
TEST=# ROLLBACK;
ROLLBACK
[/postgres]

The output basically means; 1..1 we ran 1 of 1 tests and ok 1 – Testing nothing test number 1, with description “Testing nothing” returned “ok”. The above looks a bit messy and the docs recommend setting the following in psql to make the behaviour and output a little easier to handle:

[postgres]
\set ECHO
\set QUIET 1
\pset format unaligned
\pset tuples_only true
\pset pager
\set ON_ERROR_ROLLBACK 1
\set ON_ERROR_STOP true
\set QUIET 1
[/postgres]

Now lets create a function and perform an actual test on the results:

[postgres]
TEST=# CREATE OR REPLACE FUNCTION public.test1(text) RETURNS text AS ‘SELECT upper($1) || length($1)::text’ language ‘SQL’;

TEST=# BEGIN;
TEST=# SELECT plan(1);
1..1
TEST=# SELECT results_eq(E’SELECT * FROM public.test1(\’glyn\’)’, E’SELECT upper(\’glyn\’) || length(\’glyn\’)::text’,’public.test1() should return text in uppercase suffixed with length’);
ok 1 – public.test1() should return text in uppercase suffixed with length
TEST=# SELECT * FROM finish();
TEST=# ROLLBACK;
[/postgres]

We can also simply test if some operation like an insert succeeds:

[postgres]
TEST=# CREATE TABLE public.table1 (a text, b integer);

TEST=# BEGIN;
TEST=# SELECT plan(1);
1..1
TEST=# SELECT lives_ok(E’INSERT INTO public.table1 VALUES (\’TEST\’, 1)’);
ok 1
TEST=# SELECT * FROM finish();
TEST=# ROLLBACK;
[/postgres]

…and we can check that performing some operation fails with a particular exception:

[postgres]
TEST=# BEGIN;
TEST=# SELECT plan(1);
1..1
TEST=# SELECT throws_ok(E’INSERT INTO public.table1 VALUES (\’test\’, \’not an integer\’)’,’22P02′,NULL,’Column b is an integer, inserting text should fail’);
ok 1 – Column b is an integer, inserting text should fail
TEST=# SELECT * FROM finish();
TEST=# ROLLBACK;
[/postgres]

We can test if an operation created the expected data too, if I create a trigger on my test table to uppercase any text inserted, we can check if it worked as expected:

[postgres]
TEST=# CREATE FUNCTION public.table1_upper_func() RETURNS trigger AS $$ BEGIN NEW.a := upper(NEW.a); RETURN NEW; END; $$ LANGUAGE plpgsql VOLATILE;
TEST=# CREATE TRIGGER table1_upper_trig BEFORE INSERT OR UPDATE ON public.table1 FOR EACH ROW EXECUTE PROCEDURE public.table1_upper_func();

TEST=# BEGIN;
TEST=# SELECT plan(1);
1..1
TEST=# INSERT INTO public.table1 VALUES (‘test’,1);
TEST=# SELECT is(a, ‘TEST’, ‘Inserted values into column "a" should be uppercased’ ) FROM public.table1 WHERE b = 1;
ok 1 – Inserted values into column "a" should be uppercased
TEST=# SELECT * FROM finish();
TEST=# ROLLBACK;
[/postgres]

We can also test the existence and structure of a table:

[postgres]
TEST=# CREATE INDEX ON public.table1 USING btree(b);
TEST=#
TEST=# BEGIN;
TEST=# SELECT plan(6);
1..6
TEST=# SELECT has_table(‘public’, ‘table1’, ‘table public.table1 exists’ );
ok 1 – table public.table1 exists
TEST=# SELECT has_column(‘public’, ‘table1’, ‘a’, ‘column public.table1.a exists’);
ok 2 – column public.table1.a exists
TEST=# SELECT col_type_is(‘public’, ‘table1’, ‘a’, ‘text’, ‘Type of column public.table1.a is text’);
ok 3 – Type of column public.table1.a is text
TEST=# SELECT has_column(‘public’, ‘table1’, ‘b’, ‘column public.table1.b exists’);
ok 4 – column public.table1.b exists
TEST=# SELECT col_type_is(‘public’, ‘table1’, ‘b’, ‘integer’, ‘Type of column public.table1.b is integer’);
ok 5 – Type of column public.table1.b is integer
TEST=# SELECT has_index(‘public’, ‘table1′,’table1_b_idx’, ‘Table public.table1 has an index table1_b_idx’);
ok 6 – Table public.table1 has an index table1_b_idx
TEST=# SELECT * FROM finish();
TEST=# ROLLBACK;
[/postgres]

To wrap all of this up pgTap provides a runtests() function that will run predefined test functions without us having to define the number of tests with plan(n) or run finish(), and we can write these functions just like any other function incorporating the tests demonstrated above. By default the function looks for function names starting with test, it’s probably a good idea to put the tests in a specific schema or give them a more unique name than that.

When runtests() is run it also first looks for functions starting with “startup” or “shutdown” and these are run in alphabetical order before and after a whole set of test functions respectively. In addition functions starting with “setup” or “teardown” are, again, run in alphabetical order before and after each test function respectively. Typically these would be used to check, perform and undo any data changes required by each and or all tests, for example if we’ve called a sequence that doesn’t belong to a table during our tests, it’s value won’t be rolled back so we might want to reset it’s value.

One thing to clarify though (which came to my attention when running pg_prove blelow), is that where the “setup” and “teardown” functions are automatically rolled back at the end of each test, the “startup” and “shutdown” functions are not automatically rolled back at the end of the tests. If you want that functionality then you need to wrap your call to runtests in a transaction and roll it back at the end just as before.

E.g. we can create our test schema and functions:

[postgres]
CREATE SCHEMA unit_testing;

TEST=# CREATE OR REPLACE FUNCTION unit_testing.setup_tap_test_example()
RETURNS SETOF TEXT AS $$
BEGIN
INSERT INTO public.table1 VALUES (‘test’,1);
RETURN NEXT is(a, ‘TEST’, ‘Should have initial row in table1’) FROM public.table1 WHERE b = 1;
END;
$$ LANGUAGE plpgsql;

TEST=# CREATE OR REPLACE FUNCTION unit_testing.tap_test_example()
RETURNS SETOF TEXT AS $$
BEGIN
RETURN NEXT has_table(‘public’, ‘table1’, ‘table public.table1 exists’ );
RETURN NEXT has_column(‘public’, ‘table1’, ‘a’, ‘column public.table1.a exists’);
RETURN NEXT col_type_is(‘public’, ‘table1’, ‘a’, ‘text’, ‘Type of column public.table1.a is text’);
RETURN NEXT has_column(‘public’, ‘table1’, ‘b’, ‘column public.table1.b exists’);
RETURN NEXT col_type_is(‘public’, ‘table1’, ‘b’, ‘integer’, ‘Type of column public.table1.b is integer’);
RETURN NEXT has_index(‘public’, ‘table1′,’table1_b_idx’, ‘Table public.table1 has an index table1_b_idx’);
END;
$$ LANGUAGE plpgsql;

TEST=# CREATE OR REPLACE FUNCTION unit_testing.teardown_tap_test_example()
RETURNS SETOF TEXT AS $$
BEGIN
RETURN NEXT lives_ok(‘DELETE FROM public.table1 WHERE b = 1’);
END;
$$ LANGUAGE plpgsql;
[/postgres]

Now we can run them:
[postgres]
TEST=# BEGIN;
TEST=# SELECT * FROM runtests(‘unit_testing’,’^tap_test_example’);
ok 1 – Should have initial row in table1
# unit_testing.tap_test_example()
ok 2 – table public.table1 exists
ok 3 – column public.table1.a exists
ok 4 – Type of column public.table1.a is text
ok 5 – column public.table1.b exists
ok 6 – Type of column public.table1.b is integer
ok 7 – Table public.table1 has an index table1_b_idx
ok 8
1..8
TEST=# ROLLBACK;
[/postgres]

Gravy. We can, if we want, also install the pg_prove module from cpan to finish wrapping it all together with TAP::Harness and provide a summary of our tests:

$ sudo cpan TAP::Parser::SourceHandler::pgTAP
$ pg_prove -d TEST -R -s unit_testing -x '^tap_test_example'
runtests('unit_testing'::name, '^tap_test_example'::text); ..
# unit_testing.tap_test_example()
runtests('unit_testing'::name, '^tap_test_example'::text); .. ok
All tests successful.
Files=1, Tests=9,  0 wallclock secs ( 0.05 usr  0.02 sys +  0.00 cusr  0.01 csys =  0.08 CPU)
Result: PASS

This is just a small example of some of the tests we can do with pgTap, the suite provides a very extensive set of tests. See the documentation here.

Experimenting with Slony 2.2

I’ve been spending my spare bits of time over the last couple of weeks looking at the latest release of Slony-I. At a quick glance the main change between 2.1 and 2.2 appears to be to the sl_log table format, but although seemingly minor, the changes to the way clusters are failed over and reshaped actually go much deeper too.

For example in previous versions it was possible for a subscriber to pull multiple sets from different providers and later change the provider for any set at will using the “SCUBSCRIBE SET” command. However as of 2.2, although it’s still possible to initially subscribe a node with different providers for each set, any changes must use the “RESUBSCRIBE NODE” command, which only allows resubscribing all sets from a particular origin to a single provider.

There’s also changes to the “FAILOVER” command to improve reliability in a situation where multiple nodes have failed; you can now pass in multiple failed nodes and Slony should do the right thing. So far my tests with 2.2.2 show there may be some issues when passing in multiple failed nodes where one is a downstream provider to a cascaded subscriber, however that’s a corner case and hopefully we’ll see a fix soonish. (Edit 16/05/2014: There’s now a patch against 2.2.2 for this)

The changes to the sl_log table mean that replicated data is now replicated in a slightly more logical way; data is logged as arrays of values rather than chunks of sql to execute on the subscriber, and the data is sent over a pipe using copy rather than fetched in chunks via a cursor. Also DDL had been moved out of sl_event and into a new sl_log_script table. Upgrade will most likely require some brief downtime, as running update functions requires a lock of all sets and waiting out the cleanup interval for a cleanupevent/logswitch to happen to clear out the tables.

On a separate note, this evening (not the best use of a bank holiday weekend) whilst looking at how these changes would affect my experimental failover script I had a quick bash at adding in an “autofailover” functionality; the idea being that the script keeps polling all the nodes, and upon detecting any unavailable nodes runs the failover command. It’s a functionality I’ve never personally wanted as it’s possible to get into all sorts of trouble blindly failing over onto an asynchronous replica, in fact in a busy environment it’s pretty much guaranteed (E.g missing a single update to a product price and then taking millions of sales on the wrong price!). However, perhaps it could be quite useful in a mostly read only environment where updates are low volume such as a wiki; more thought needed I think.

(The script by the way is here slony-i/slony_failover and also mirrored on github glynastill/slony_failover)