Ross Coded Classes

Programming tips with no rose colored glasses.

Locking on rows or values that don’t exist in PostgreSQL

Image from https://www.pexels.com/photo/door-green-closed-lock-4291/

I came across a situation where for concurrency of two requests trying to make the same resource in the database, I wanted the other requests to block until the resource was inserted and then continue without error.

A similar problem is perhaps trying to determine the largest invoice number in a system and acquire the next number for the invoice you are inserting, without an error because concurrent requests inserted the same invoice number.

I had to think about this one quite a bit to find alternatives. I researched how to lock a non-existing record and it appears to me that you can’t in PostgreSQL even at ISOLATION LEVEL SERIALIZABLE.

I did find that in READ COMMITTED that “UPSERT” (INSERT … ON CONFLICT) can help solve the problem by blocking until another transaction inserts the record and then not attempt to insert it again.

-- set up a test table
create table t (id text primary key, other text);

This works: Using ON CONFLICT to make sure that the record is only inserted once and is not updated by a second concurrent transaction:

-- run two transactions with these statements (statement for statement)
begin transaction isolation level read committed;
select txid_current();
-- optionally abort if it already exists
select * from t where id = 'tryupsert';
-- optionally if found exit

INSERT INTO t (id, other) VALUES ('tryupsert', txid_current()::text)
ON CONFLICT (id) DO NOTHING;

-- second transaction blocks on the above
select * from t where id = 'tryupsert';
commit;
-- second transaction unblocked now
-- The first transaction inserted the row if it did not exist
-- The second transaction executed the DO NOTHING.

This works: Using ON CONFLICT to make sure that the record exists and the last transaction wins on the other column value (all transactions change the value of the row):

-- run two transactions with these statements (statement for statement)
begin transaction isolation level read committed;
select txid_current();

INSERT INTO t (id, other)
SELECT 'tryupsert' AS id, txid_current()::text AS other
ON CONFLICT (id) DO UPDATE SET (id, other) = (excluded.id, excluded.other)
RETURNING *;

-- second transaction blocks on the above
commit;
-- second transaction unblocked now
-- The first transaction inserted the row if it did not exist
-- or updated it if it did exist.
-- The second transaction updated the row.

This works: Using advisory locks:

-- run two transactions with these statements (statement for statement)
begin transaction isolation level read committed;
select txid_current();
select * from t where id = 'advisory';
-- if found exit

select pg_advisory_xact_lock(hashtext('advisory'));
-- second transaction blocks on the above
select * from t where id = 'advisory';
-- if found exit
-- second transaction will unblock when first tx commits and see the row 
-- was already inserted from the first transaction and skip the insert.

INSERT INTO t (id, other) VALUES ('advisory', txid_current()::text);

select * from t where id = 'advisory';
commit;

Using SERIALIZABLE doesn’t work. This makes sense after understanding better (yet again because I keep forgetting how it works – it is kind of designed for less blocking and code that is built to retry on serialization failure)

begin transaction isolation level serializable;
select * from t where id = 'tryserializable' for update;
-- if found exit

-- second transaction does not block
insert into t values ('tryserializable', txid_current()::text);
-- second transaction block on the above
select * from t where id = 'tryserializable';
-- id | other
-- -----------------+-------
-- tryserializable | 628
commit;

The second transaction ends with this error:

ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during write.
HINT: The transaction might succeed if retried.

Installing Multiple Instances (Clusters) of PostgreSQL on CentOS 7

Today I installed multiple instances of PostgreSQL 9.4 on one CentOS 7 machine. I learned a little bit about systemd and how it can adjust the Out-Of-Memory killer with OOMScoreAdjust.

I also learned about data page checksums that were added in PostgreSQL 9.3 but not enabled by default. (Coming from SQL Server this corresponds to PAGE_VERIFY CHECKSUM which is a per-database setting. CHECKSUM was introduced in SQL Server 2005 and is the default for new databases and can be changed at any time.)

I happened to find a similar post about installing multiple PostgreSQL instances but I already had the postgres binaries, I just needed to configure additional services to run the different clusters.

# Installing multiple instances of Postgresql 9.4 on CentOS 7 64-bit using systemd
# This disables the Out-Of-Memory killer for the main process

# Change these three variables to pick an instance name, port 
# and data directory. I suggest using a directory name that is the
# same as the instance name but you can make that different.
MYPGINSTANCE=cluster1
MYPGPORT=5432
MYPGDATA=/data/pgsql/9.4/${MYPGINSTANCE}

# Create a data directory and change its ownership
mkdir –p ${MYPGDATA}
chown postgres:postgres ${MYPGDATA}

# initialize the database using user postgres and enable page checksums (introduced in 9.3)
su postgres -c "/usr/pgsql-9.4/bin/initdb -k -D ${MYPGDATA} -U postgres"

# Use heredoc to create a new .service file for this instance:
cat >/etc/systemd/system/postgresql-9.4-${MYPGINSTANCE}.service <<EOF
.include /lib/systemd/system/postgresql-9.4.service

[Service]
Environment=PGDATA=${MYPGDATA}
Environment=PGPORT=${MYPGPORT}
EOF

# Lets see if this looks right...
cat /etc/systemd/system/postgresql-9.4-${MYPGINSTANCE}.service

# Enable the service at startup, start it, and check its status
systemctl enable postgresql-9.4-${MYPGINSTANCE}
systemctl start postgresql-9.4-${MYPGINSTANCE}
systemctl status postgresql-9.4-${MYPGINSTANCE}

# Lets see the logs for this new service before the postgresql logging takes over
journalctl -u postgresql-9.4-${MYPGINSTANCE}