Monday, 3 June 2019

Increasing data accessibility through lock avoidance via across-member Currently Committed semantics in Db2 pureScale

- David Sciaraffa, Software Engineering Manager – IBM Db2




In typical application environments, many concurrently running applications will attempt to read, insert, or update database data simultaneously.  Access to database data must be isolated from potential concurrency conflicts, so that operations are predictable with and without the presence of other concurrently executing applications.  “Isolation” is a capstone of A.C.I.D. compliant database systems.

The degree to which an application can see the data changes of other concurrently running applications is determined by its isolation levelSome isolation levels (like ‘repeatable read’ - RR) are very restrictive, others (like ‘uncommitted read’ – UR) are very liberal, while the most common (‘cursor stability’ – CS) strikes a good balance for most applications.

Db2 achieves isolation through multiple methodologies whose complexity is mostly outside the scope of this blog. The most basic methodology comes in the form of database locks. The read, insert, or update of a data row by an application may result in the database manager acquiring and/or holding a lock on the row, depending on the configured isolation level.

Focusing on the most common type of isolation (‘cursor stability’ – CS), when reading a row of data an application using this isolation level desires to read the most recently committed version of the row (ignoring any in-flight changes on the row by other uncommitted transactions). An application reading data with CS isolation may obtain a row lock on the row before processing it to ensure that only committed data is passed back to the user. The row lock is released after reading the row data.

In early versions of Db2, prior to the implementation of currently committed semantics, a row reader using CS isolation must wait for an in-flight application updating a row to commit, before acquiring the lock on the row in order to process it. This could be a significant bottleneck in high concurrency workloads.

With the implementation of currently committed semantics, a row reader using CS isolation is able to read the currently committed version of a row, bypassing any in-flight updates on that row.


Consider the following diagram depicting a row read application in lock-wait due to a concurrent row updater, with no currently committed semantics enabled:


[1] A user (App1) requests to update a row on table Tab1.
[2] App1 is granted a row lock on this row in exclusive (X) mode.
[3] App1 updates the row in the table, and a log-record is written into the recovery log stream (this log-record contains the original version of the row).  App1 has not yet committed its unit-of-work, so this row change is still in-flight.
[4] Another user (App2) requests to read this same row (CS isolation is inferred, but not depicted).
[5] App2 must acquire a row lock on this row to ensure to read the currently committed version of the row. It cannot acquire the row lock because an uncommitted in-flight transaction. App2 must wait until App1 committed and release the row lock, before proceeding.


With currently committed semantics enabled, a row reader application can now access the currently committed version of the row, bypassing the lock held by the row updater.  This is achieved by retrieving the currently committed row directly from the recovery log-stream. Information about where to find the currently committed version of the row within the recovery log stream is maintained with the lock-request-control-block (LRB) for the row, as depicted:


[1] A user (App1) requests to update a row on table Tab1.
[2] App1 is granted a row lock on this row in exclusive (X) mode.
[3] App1 updates the row in the table, and a log-record is written into the recovery log stream (this log-record contains the original version of the row). A pointer to the original version of the row in the recovery log stream is stored within the row lock information structure (LRB). App1 has not yet committed its unit-of-work, so this row change is still in-flight.
[4] Another user (App2) requests to read this same row (CS isolation is inferred, but not depicted).
[5] App2 must acquire a row lock on this row to essure to read the currently committed version of the row. It cannot acquire the row lock because the uncommitted in-flight transaction (App1).
[6] App2 uses the pointer to the original (currently committed) version of the row within the row lock information to retrieve the row from the recovery log stream.

In Db2 pureScale environments Version 11.1 and prior, support concurrently committed semantics was limited, a CS isolation row reader could bypass an in-flight row updater and retrieve the currently committed version of a record from the recovery log stream only when the row reader and row updater (lock holder) resided on the same member.  When the row updater was not on the same member as the row reader, the row reader would default to a lock wait condition, as depicted here:

[1] A user (App1) on pureScale member1 requests to update a row on table Tab1.
[2] App1 is granted a row lock on this row in exclusive (X) mode. This lock is effective in both the member local lock manager, as well as the global lock manager.
[3] App1 updates the row in the table, and a log-record is written into the member-local recovery log stream (this log-record contains the original version of the row).  A pointer to the original version of the row in the recovery log stream is stored within the row lock information structure (LRB). App1 has not yet committed its unit-of-work, so this row change is still in-flight.
[4] Another user (App2) on pureScale member2 requests to read this same row (CS isolation is inferred, but not depicted).
[5] App2 must acquire a row lock on this row to ensure to read the currently committed version of the row. It cannot acquire the row lock because an uncommitted in-flight transaction. 
[6] App2 cannot utilize the pointer to the original (currently committed) version of the row within the row lock information to retrieve the row, because this information only exists on member1. App2 must wait until App1 commits and releases the row lock, before proceeding.

Starting in Version 11.5, a CS isolation row reader in a Db2 pureScale environment is capable of retrieving the currently committed version of a record was when either the row reader and row updater (lock holder) resided on the same member, or on different members, as depicted here:

[1] A user (App1) on pureScale member1 requests to update a row on table Tab1.
[2] App1 is granted a row lock on this row in exclusive (X) mode. This lock is effective in both the member local lock manager, as well as the global lock manager.
[3] App1 updates the row in the table, and a log-record is written into the member-local recovery log stream (this log-record contains the original version of the row).  A pointer to the original version of the row in the recovery log stream is stored within the row lock information structure (LRB). App1 has not yet committed its unit-of-work, so this row change is still inflight.
[4] Another user (App2) on pureScale member2 requests to read this same row (CS isolation is inferred, but not depicted).
[5] App2 must acquire a row lock on this row, in order to ensure to read the currently committed version of the row. It cannot acquire the row lock because an uncommitted in-flight transaction. 
[6] App2 cannot utilize the pointer to the original (currently) version of the row within the row lock information to retrieve the row, because this information only exists on member1.
[7] App2 communicates with Member1 (via an RPC request) to retrieve the original (currently committed) version of the row using the row lock information on member1. The row is returned back to member2.

The enhancement is enabled by default starting in Version 11.5.




I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.

No comments:

Post a Comment

Translate