Friday 13 October 2017

Don't Get Caught Unprepared - Monitor Invalid Tablespaces on your HADR Standby database and Repair Them Efficiently.

- David Sciaraffa, Software Development Manager – IBM Db2 Availability



The Db2 High Availability Disaster Recovery feature, better known as HADR, is a database replication solution for disaster recovery. As one of the most successful and widely deployed Db2 features, there are few database administrators in the world unfamiliar with it.


While application changes are not required to deploy HADR, and getting up and running is quick and simple, database administrators should accommodate the extra monitoring required to ensure the health of the standby database. The IBM Developerworks HADR Wiki provides excellent insight into the nuances of monitoring HADR databases, covering both availability and performance. However, while an HADR standby database may be online and available, one or more of its tablespaces may be offline and unavailable. This condition has caught many database administrators by surprise after a TAKEOVER HADR operation.


An HADR standby database is in continuous replication with its primary database by replaying transaction log data. If an error occurs during the replay of this log data on the standby database, such as a filesystem-full condition or other error, the affected tablespace on the standby database is put into an error state and log replay stops for that tablespace. The primary database is unaware of this condition on the standby database, and applications accessing the primary database carry on, business as usual. Because of this assumed health of the standby database, a subsequent takeover operation results in the unexpected unavailability of tablespace data when the standby database becomes the primary database.


Monitoring for tablespaces in erroneous states on the standby database was a little cumbersome prior to Db2 Version 10.5 Fix Pack 9 or v11.1.1.1, as it required examining "db2pd -tablespaces" output on the standby database system and checking for an abnormal tablespace state value. Starting in Version 10.5 Fix Pack 9 or v11.1.1.1, monitoring this condition became much easier: a new flag, 'STANDBY_TABLESPACE_ERROR', was added to the HADR_FLAGS field of db2pd -hadr (or the MON_GET_HADR() table function) directly on the primary database.
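This condition can now be checked directly on the primary database, either with db2pd or with SQL. A minimal sketch (the database name MYDB is a placeholder; verify the exact flag and column names on your fix pack):

        # Check the HADR_FLAGS field for the STANDBY_TABLESPACE_ERROR flag.
        db2pd -db MYDB -hadr | grep HADR_FLAGS

        # Or query the same information through the MON_GET_HADR() table function.
        db2 "SELECT HADR_ROLE, HADR_STATE, HADR_FLAGS FROM TABLE(MON_GET_HADR(NULL))"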


For more details about monitoring this error condition, please see the new technote Monitoring and identifying tablespaces in invalid or error state, or tables in Inoperative state on the HADR Standby database.


Now that we've discussed the importance of monitoring for this condition, you're probably wondering how to best resolve the issue.

Prior to Version 10.5 Fix Pack 9 or v11.1.0.0, resolving the issue required a full re-initialization of the standby database (i.e. a full database backup image of the primary database had to be restored on the standby database and HADR restarted). For large databases this could take a considerable amount of time, and leave the primary database exposed without a disaster recovery counterpart.

Starting in Version 10.5 Fix Pack 9 (non-pureScale only) or v11.1.0.0, resolving the issue requires only the re-initialization of the erroneous tablespace(s). For example, a tablespace backup image from the primary database can be restored on the standby database and HADR reactivated. A full database backup restore is not required. For large databases this can drastically reduce the time required to re-establish HADR disaster resilience.
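At a high level, the recovery flow mirrors the sentence above: back up the affected tablespace on the primary, restore that image on the standby, and reactivate HADR. The sketch below is an illustrative outline only (database name, tablespace name, and paths are placeholders); follow the technote referenced below for the complete supported procedure and its prerequisites:

        # On the primary: take an online backup of the affected tablespace.
        db2 "BACKUP DB MYDB TABLESPACE (TBSP1) ONLINE TO /backups"

        # On the standby: deactivate the database, restore the tablespace image,
        # then restart HADR so log replay can resume for the tablespace.
        db2 "DEACTIVATE DB MYDB"
        db2 "RESTORE DB MYDB TABLESPACE (TBSP1) FROM /backups"
        db2 "START HADR ON DB MYDB AS STANDBY"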


For more details on how to recover from this condition using a tablespace backup image, please see the technote How to recover from tablespace errors on an HADR Standby database.



I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.



Sunday 1 October 2017

Improve Db2 HADR Resilience of Network Transmission Corruptions

- David Sciaraffa, Software Development Manager – IBM Db2


In systems configured for disaster resilience through the High Availability Disaster Recovery (HADR) feature, a Primary database is coupled with one or more Standby databases. The Primary database sends transaction log data across the network to the Standby database(s), where this log data is replayed, allowing the Standby database(s) to stay synchronized with any changes that are made on the Primary database. HADR is one of the most popular features of Db2 because of its simplicity of setup and the fact that it requires no application changes.

To achieve a balance between data protection and performance, HADR supports several levels of synchronization.

HADR synchronization/protection levels:
SYNC: provides the highest level of protection. A transaction commit operation on the Primary database returns to the application only after its transaction log data has been successfully written to disk on the Standby database. [This level usually has a noticeable performance impact versus the other protection levels.]
NEARSYNC: provides the next best level of protection. A transaction commit operation on the Primary database returns to the application only after its transaction log data has been successfully received in memory on the Standby database.
ASYNC: A transaction commit operation on the Primary database returns to the application only after its transaction log data has been successfully handed to the TCP socket on the Primary database side for transmission to the Standby.
SUPERASYNC: A transaction commit operation on the Primary database always returns to the application, regardless of the status of the transmission of log data to the Standby database.
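The synchronization level is controlled by the HADR_SYNCMODE database configuration parameter. A minimal sketch (MYDB is a placeholder; the parameter should be set consistently on the primary and standby, and HADR typically needs to be restarted for a change to take effect):

        # View the current synchronization mode.
        db2 "GET DB CFG FOR MYDB" | grep HADR_SYNCMODE

        # Change the synchronization mode.
        db2 "UPDATE DB CFG FOR MYDB USING HADR_SYNCMODE NEARSYNC"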

Sync modes are well documented in the IBM Developerworks DB2 HADR Wiki.

HADR log data process flow:
The underlying protocol and process flow for communication of transaction log data within an HADR system utilizes parallel processes in order to achieve high performance, and is thus not trivial.

The diagram below (borrowed from the IBM Developerworks DB2 HADR Wiki) shows the flow: a db2agent process (executing an application's SQL operation) generates a transaction log record within an in-memory buffer of transaction log pages; one or more transaction log pages are asynchronously written to a transaction log file on the Primary database server by the db2loggw daemon, and in parallel these transaction log pages are sent over the network via a TCP port by the db2hadrp daemon. The data is then received on the Standby database via the TCP port by the db2hadrs daemon, asynchronously written to a transaction log file on the Standby database by the db2loggw daemon, and in parallel passed to a db2shred daemon, which parses the log pages and facilitates replay via other daemons on the Standby database.

[HADR log data flow diagram]


Identifying network corruption errors:
The transmission of transaction log data across the network from the Primary to the Standby database represents a potential point of failure. When networks are unstable or unreliable, this can manifest as corruption of transaction log data or metadata during transmission.
The Standby database will typically fail with a "Log page checksum mismatch. pageLso xxxxxx, CheckSum xxxxxx" db2diag.log error message in Version 10.5 Fix Pack 8 or earlier fix packs, or with "Bad page detected. The bytecount is incorrect" and "Bad page detected. Checksum mismatch…" errors in newer fix packs.
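These messages can be spotted quickly with the db2diag tool on the standby server; a sketch (the exact message text varies between fix packs, as noted above):

        # Scan the diagnostic log on the standby for log page integrity errors.
        db2diag -g "msg:=Log page checksum mismatch"
        db2diag -g "msg:=Bad page detected"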

Network corruption errors and Crash Recovery failures on Standby database:
Some integrity checking of transaction log pages on the Standby database is always performed when the log pages are parsed by the db2shred daemon. While this does provide notification of an integrity failure, it is possible for corrupted transaction log data to be written into a transaction log file on the Standby database before the integrity check failure is detected by the db2shred daemon. When the Standby database fails due to the transaction log page integrity check failure, it generates db2diag.log errors (as described above) and terminates. A subsequent restart of the Standby database (and its implicit crash recovery) will fail due to the same condition. To recover from this condition, a database administrator typically needs to either copy valid transaction log files from the Primary database, or reinitialize the Standby database.


Adding resilience to network corruption errors:
Starting in Version 10.5 Fixpack 9 and Version 11.1.3.3, more aggressive integrity checking can be enabled on the Standby database, along with automatic retry logic.

By enabling the DB2_ENABLE_HADR_LOG_PAGE_INTEGRITY_CHECK registry variable, integrity checking of transaction log data is performed by the Standby database's db2hadrs daemon as log pages are received from the TCP port, before they are written to log files by the db2loggw daemon or parsed by the db2shred daemon. When an integrity check failure is detected, an error message is written to the db2diag.log (as described above), and the db2hadrs daemon automatically attempts to have these log pages retransmitted from the Primary database. This happens transparently to the application and the administrator.
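Enabling the check is a single registry variable setting on the standby instance. A minimal sketch (confirm on your fix pack whether an instance restart is required for the setting to take effect):

        # Enable log page integrity checking and automatic retransmission.
        db2set DB2_ENABLE_HADR_LOG_PAGE_INTEGRITY_CHECK=ON

        # Verify the setting.
        db2set -all | grep HADR_LOG_PAGE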

For more details on this new feature, please see:

Technote#2008073 - Improve Db2 HADR Resilience Through Network Data Integrity Checking and Retransmission



I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.


Monday 18 September 2017

Understanding Fallback Compatibility of Db2 Version 11.1 Fix Packs and Mod Packs

- David Sciaraffa, Software Development Manager – IBM Db2 Availability


Administration of most software systems requires periodic updates of software levels. These updates usually come from proactively applying preventative fixes, from the desire to use a new feature, or reactively because an urgent fix is needed.

In Db2, software updates are provided in various ways, and an experienced database administrator will be familiar with each.


Db2 Software Delivery Vehicles

  • Version

A Db2 version or release is a major collection of new features and functionality. Db2 Version 11.1 is the successor of Version 10.5, 10.1, 9.7, 9.5, 9.1, 8.2, 8.1, 7.2, 7.1, and so on. New Db2 versions are released on a periodic basis, usually in the range of 18-32 months. Deploying a new Db2 version requires an upgrade procedure to modify database metadata and catalogs. Fallback compatibility between Db2 versions is never possible (described later).

  • Modification Pack

A Db2 Modification Pack is also a collection of significant features and functionality. However, it differs from a new version in that no upgrade procedure is required. Db2 modification packs are usually fallback compatible (described later). At the time of this writing, the most recent modification pack of Db2 is "Version 11.1 Modification Pack 2 Fix Pack 2" (abbreviated as 11.1.2.2). Prior to Version 11.1, Db2 did not use the modification pack nomenclature.

  • Fix Pack and iFix

A Db2 Fix Pack is a collection of mostly stability fixes on a specific release or modification pack. The scope of these fixes is generally not significant, and there is usually no new feature or functionality. Fix Packs are released on a periodic basis, approximately every 3-4 months for brand new releases, stretching to approximately every 12 months for older, mature releases. Prior to Version 11.1 (before Db2 adopted the modification pack nomenclature), some Fix Packs did contain significant features and, because of this, may not have been fallback compatible (described later).

An iFix is a type of Fix Pack; however, the fixes included in an iFix are limited to those of highest impact and minimal risk.

  • Special Builds

A Db2 special build is a delivery vehicle used to include one or a few fixes for problems which a customer has encountered. Often these fixes are new and not yet available in a fix pack. Special builds are usually not provided on a proactive basis. Special builds are built on top of the fix pack level which the customer has deployed, and often cumulatively on top of an existing special build which was previously provided to the customer.

Special builds may also come in the form of 'Debug builds' which contain small changes to assist with the root cause determination of a problem, and are usually only temporarily deployed.


Fallback Compatibility

Planning for the deployment of a new version, modification pack, or fix pack requires a back-out strategy. Once a database is running on a newer version/mod/fix pack, consideration is needed as to whether the database data structures, backup images, and transaction log files are still compatible with the previous version/mod/fix pack.
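A natural first step in any back-out plan is to confirm the exact level currently deployed. The db2level command reports the installed version, modification pack, and fix pack:

        # Report the installed Db2 level, e.g. DB2 v11.1.2.2.
        db2level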

A detailed technote has been published to describe the many nuances of fallback compatibility, and should help in the assembly of a back-out strategy:

Compatibility between DB2 for LUW Version 11.1 Mod-Packs and Fix-Packs:
http://www-01.ibm.com/support/docview.wss?uid=swg22003131



I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.

Friday 1 September 2017

Avoiding Lock Escalations and Improving Application Concurrency

- David Sciaraffa, Software Development Manager – IBM Db2 Availability



The concept and purpose of a database 'lock' should be well understood by all database administrators and application developers. Locks allow for multiple applications to access and manipulate database data at the same time.  Locks prevent race-conditions and provide the I in an A.C.I.D. compliant database - Isolation.

Locks, being conceptual in nature, come in many different forms and vary by database architecture. Most enterprise databases have two fundamental types of locks in common: row locks and table locks. When a row within a table is being inserted, updated, deleted, or read, a row lock may be acquired. A lock on the entire table can also be acquired for some operations. An application holds its locks for the duration of a unit of work, until a commit or rollback operation is performed (ending the unit of work), at which point the locks are released.


Isolation Levels

When a row is read from a table by an application, the isolation level of the application determines the type of lock required. The isolation level allows the application to dictate how data changes are isolated from other applications. For example, suppose application A is inserting rows into a table but has not yet committed its changes (meaning that it could cancel and roll back these changes at any time), while application B is reading rows from this table -- should application B be allowed to read the uncommitted data being inserted by application A? Isolation levels allow the application to configure this behaviour, and more.

Isolation levels are well documented in the Db2 Knowledge Center. This developerworks article provides an excellent overview of isolation levels: https://www.ibm.com/developerworks/data/library/techarticle/dm-0509schuetz/


Finite Memory for Locks

Although locks are conceptual in nature, they are physically manifested within the database because a portion of database memory is used to manage and track locks.

The LOCKLIST configuration parameter is used to configure the quantity of database memory to use for managing and tracking locks.  By default this parameter is tuned by the Self Tuning Memory Manager and will increase and decrease with usage.

Another configuration parameter, called MAXLOCKS, allows the database administrator to configure the maximum percentage of locklist memory that a single application can use. This is meant as a safeguard to prevent a single untuned SQL operation from consuming all the locklist memory.
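Both parameters can be inspected and adjusted with the usual database configuration commands. A minimal sketch (MYDB and the values shown are placeholders; LOCKLIST is specified in 4 KB pages):

        # View the current settings (AUTOMATIC indicates STMM tuning).
        db2 "GET DB CFG FOR MYDB" | grep -i -E "LOCKLIST|MAXLOCKS"

        # Set explicit values, or hand the parameters back to the Self Tuning Memory Manager.
        db2 "UPDATE DB CFG FOR MYDB USING LOCKLIST 16384 MAXLOCKS 60"
        db2 "UPDATE DB CFG FOR MYDB USING LOCKLIST AUTOMATIC MAXLOCKS AUTOMATIC"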

Since locklist memory is finite in size, the database manager is only capable of managing a finite quantity of locks. So, how should the database behave when locklist memory is exhausted and no more locks can be acquired to service applications? Db2's default behaviour is to free locklist memory by performing lock 'escalation'.


Lock Escalation

Lock escalation is the process by which many row locks on the same table and associated with an application connection, are released and replaced by a single table lock associated to that application connection. Thus the locklist memory which was previously used by the many row locks becomes available for reuse.

While lock escalation is beneficial in that it allows the database to continue to grant new locks, it also has a negative side effect. By escalating to a single table lock associated with that application, other applications may no longer be able to access rows in the table (depending on their configured isolation levels).

For this reason, the frequency of lock escalations is monitored by experienced database administrators, and tuning is performed. Lock escalations can be monitored by querying the LOCK_ESCALS value of the MON_GET_ACTIVITY() table function (as well as many other granular table functions).
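A periodic check along these lines can flag escalations early. The sketch below uses MON_GET_WORKLOAD(), one of the other granular monitoring functions mentioned above; verify the column names on your level:

        # Sum lock escalations per workload since database activation.
        db2 "SELECT SUBSTR(WORKLOAD_NAME,1,30) AS WORKLOAD, SUM(LOCK_ESCALS) AS TOTAL_ESCALS
             FROM TABLE(MON_GET_WORKLOAD(NULL,-2))
             GROUP BY WORKLOAD_NAME
             ORDER BY 2 DESC"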


Common Causes of Lock Escalation

The possible causes of lock escalations are easily described:

- When an application inserts/updates/deletes a very large quantity of rows in a single unit of work, a lock may be required for every row, and this can exhaust locklist memory. Applications should never perform large batch I/U/D operations like this in a single unit of work; instead, they should repeatedly perform a commit operation after each small set of rows is processed. The lock_count value of the MON_GET_LOCKS() table function can be monitored periodically to identify any applications which hold a large quantity of locks (see the sketch after this list).

- When an application is using a strict isolation level (such as 'Repeatable Read (RR)'), a lock will be required for every row read by that application. For queries over large data sets, this can exhaust locklist memory. Tuning options must be explored (such as splitting the query into multiple smaller queries, having the application explicitly acquire a table lock ahead of time, or reducing the isolation level of the application if that is acceptable from a business perspective). The lock_count value of the MON_GET_LOCKS() table function can be monitored periodically to identify any applications which hold a large quantity of locks.

- When locklist memory is not tuned by the Self Tuning Memory Manager, it may simply be too small for the database workload. If the database administrator is not able to identify any applications which fit the two scenarios described above, then increasing the amount of locklist memory may be required. Estimating the ideal quantity of locklist memory is tricky and requires a deep understanding of application workloads and queries. This technote (http://www-01.ibm.com/support/docview.wss?uid=swg21600309) may help in making that estimate. However, when lock escalations occur due to locklist memory exhaustion, a database administrator will usually just increase the locklist memory by some factor (say 25%-100%) and then continue to monitor the number of lock escalations.

- When the MAXLOCKS configuration value is set extremely small, lock escalations will occur. Since the intent of this configuration parameter is to act as a safeguard against runaway, un-tuned queries, we generally see this value configured between 50% and 70%.
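As noted in the first two causes above, applications holding an unusually large number of locks can be identified with a query along these lines. A sketch (it simply counts the rows MON_GET_LOCKS() returns per connection rather than relying on any particular column; verify against your level):

        # List the application handles holding the most locks.
        db2 "SELECT APPLICATION_HANDLE, COUNT(*) AS LOCKS_HELD
             FROM TABLE(MON_GET_LOCKS(NULL,-2))
             GROUP BY APPLICATION_HANDLE
             ORDER BY 2 DESC
             FETCH FIRST 10 ROWS ONLY"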


Forcing Lock Escalation Avoidance

When lock escalation problems are observed, a database administrator should always explore the cause and look for any tuning opportunities, as described earlier. Sometimes, queries and workloads are ad hoc and beyond the control of the database administrator. In such scenarios, lock escalations are unpredictable and difficult to tune.

Starting in Db2 Version 11.1.2.2, a new feature called 'DB2_AVOID_LOCK_ESCALATION' provides control over lock escalation behaviour.  The feature is configured through a registry variable:

        db2set DB2_AVOID_LOCK_ESCALATION=ON
       
(the database or database manager does not require a restart for this setting to take effect).

When this feature is enabled and an application encounters the conditions where lock escalation would normally occur (i.e. locklist memory is full, or the MAXLOCKS limit is reached), instead of performing lock escalation the application receives an SQL0912N error. The application then has an opportunity to perform either a COMMIT or a ROLLBACK to release the locks held by this application.

For more information on lock escalation avoidance, please see:

https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.regvars.doc/doc/r0005665.html#P_DB2_AVOID_LOCK_ESCALATION



I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.

Thursday 3 August 2017

Access your database sooner after a system failure -- introducing DB2_ONLINERECOVERY!

- David Sciaraffa, Software Development Manager – IBM Db2 Availability


System failures happen. Modern enterprise IT stacks are complex feats of software and hardware engineering, encompassing billions of physical transistors and billions of lines of logical code, comprising an almost infinite number of computational permutations. These stacks run in data centers subject to physical conditions such as electrical utility and temperature. Lest we forget the most vulnerable component of the IT stack: human error. Thus, system failures happen.



An enterprise database must be extremely resilient to system failures. Data loss or corruption are impermissible. Db2 has a distinguished history of resilience to system failures. It is truly best of breed. I'm proud to manage a team of talented developers focused on database recovery and availability.

Db2 recovery from a system failure begins when the database is restarted, and utilizes a process called 'crash recovery'. During crash recovery, any database changes which were previously committed by the application but not yet persisted to disk, are re-applied (we refer to this as the Forward-phase of crash recovery). Next, any database changes which were persisted to disk but not yet committed by the application, are undone (we refer to this as the Backward-phase of crash recovery). The Db2 transaction logs are processed to facilitate all this work.

Crash recovery processing is not limited to system failures. In a database configured for disaster recovery via HADR, a TAKEOVER HADR operation on the Standby database must also perform crash recovery in order to undo any changes which were persisted to disk but not yet committed by the application.

While a database is performing crash recovery, it is not connectable by any application.

While the process of crash recovery provides database atomicity and durability in the event of a system failure, the process can take time to complete. For well-tuned applications, crash recovery usually completes in a reasonable amount of time (ranging from near-instantaneous to minutes) because the quantity of transaction log data to be processed is small. For poorly tuned applications, the quantity of log data to be processed can be large, and crash recovery can therefore take a significant amount of time.


Well-tuned applications and databases share some common themes:

- After changing data (inserting, updating, or deleting), these applications perform a transaction commit operation expediently. Not leaving transactions open for a long time reduces the total quantity of log data that needs to be processed if crash recovery is required.

- Batch data changes are constrained to small manageable sizes. For example, if 1 million records need to be inserted into the database during a nightly batch, the application will perform this in smaller units of work of say 1 thousand records and then perform a transaction commit operation after each unit of work. By not generating single huge units of work, we reduce the quantity of transaction log data that needs to be processed if a crash recovery is required.

- The database is tuned to flush data changes to disk frequently. This is accomplished by configuring the PAGE_AGE_TRGT_MCR configuration parameter to a reasonable size in minutes. Since flushing data changes too frequently may overwhelm the system with I/O under heavy workloads, tuning this parameter will often require monitoring and adjustment. The DB2_USE_ALTERNATE_PAGE_CLEANING registry variable is a popular method of increasing the frequency of flushing data changes.

- Strict limitations are enforced on the size of transactions using the NUM_LOG_SPAN and MAX_LOG configuration parameters.

- The database is monitored proactively to identify any poorly tuned applications. The log_to_redo_for_recovery value of the MON_GET_TRANSACTION_LOG() table function can help to estimate the quantity of transaction log data that would need to be processed during crash recovery. The log span of uncommitted transactions can be monitored using db2pd -transactions, as described in the 'Determine the Size and Span of Uncommitted Transactions' section of this technote: http://thinkingdb2.blogspot.ca/2015/07/how-to-estimate-duration-of-takeover.html (a sketch of both approaches follows this list).
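A sketch of the two monitoring approaches mentioned in the last point above (MYDB is a placeholder; verify the column names on your level):

        # Estimate the quantity of log data crash recovery would need to process.
        db2 "SELECT MEMBER, LOG_TO_REDO_FOR_RECOVERY
             FROM TABLE(MON_GET_TRANSACTION_LOG(-2))"

        # Inspect currently uncommitted transactions and their log span.
        db2pd -db MYDB -transactions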


While system failures seldom happen, most customers expect crash recovery to complete quickly. Since a database administrator can never fully prevent poorly tuned applications, a common customer request is quicker accessibility to the database after a system failure, during crash recovery.

Our development team decided upon a solution which promised increased availability, without risking the distinguished resilience that Db2 has achieved over the decades.

The DB2_ONLINERECOVERY feature was rolled out in Db2 Version 11.1.2.2.

The feature allows for the database to become connectable once the forward phase of crash recovery has completed, while the backward phase of crash recovery is progressing. Tables which contain data changes that need to be undone during the backward phase of crash recovery will not be accessible to queries until the last of their data changes is undone.  All other tables are fully accessible to queries throughout the backward phase of crash recovery.
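The feature is enabled through a registry variable. A minimal sketch (the exact value and any restart requirements should be confirmed against the documentation linked below):

        # Enable database accessibility during the backward phase of crash recovery.
        db2set DB2_ONLINERECOVERY=ON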

For more detailed information about this cool new feature, please see:

Database accessibility during backward phase of crash recovery or HADR takeover
https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.admin.ha.doc/doc/c0070281.html



I hope this information is useful. If you have any thoughts or questions, feel free to submit a comment below.

Wednesday 12 July 2017

Reduce Problem Resolution Times with Persistent Db2 Diagnostic Data Collection

- David Sciaraffa, Software Development Manager – IBM Db2 Availability


In common lore spread throughout software engineering classes, the term ‘bug’ (referring to a software defect) was coined after a moth embedded itself within an electromechanical computer relay sometime during the infancy of computing. True or not, the almost infinite permutations of timing and procedural conditions in modern software stacks mean that software defects (bugs) are inevitable.



All software, including enterprise relational databases, is susceptible to software defects. The IBM Db2 support and development teams provide excellent assistance for a wide range of support issues, usage questions, best practices, and of course software bugs.

The complexity of software bugs varies, and the more complex issues often require specific diagnostic data leading up to the problem occurrence. This means that, quite often, there is insufficient diagnostic data from the first occurrence of a problem to determine the root cause, and a reproduction of the problem is required with diagnostic data collection in place for the time frame leading up to the problem.


The Db2 Persistent Data Collection Scripts help to reduce problem determination times, and often reduce the need for problem reproductions.

The scripts are available for download here: http://www.ibm.com/support/docview.wss?uid=swg22005477


In the spirit of agility, the scripts are simple ksh scripts, operational on both Linux and AIX platforms. They collect various diagnostics from both the Operating System, and from Db2. The Db2 diagnostics come from negligible-impact MON_GET* functions, and the db2pd tool (which is functional even when database connectivity is not possible).

The scripts are used by several extremely high-transaction-volume production banking database systems, with no performance impact.

There are three scripts in total:

Db2_dataCollect_minutely.sh – You should use a crontab or scheduler to execute this script about every minute or two. 
The diagnostic data collected by this script is stored into a staging folder and is automatically tarred and compressed (gzip) upon completion (with exceptional compression ratios, so data volumes are manageable).

Db2_dataCollect_hourly.sh – You should use a crontab or scheduler to execute this script about every hour. 
The diagnostic data collected by this script is stored into a staging folder and automatically tarred and compressed (gzip) upon completion (with exceptional compression ratios, so data volumes are manageable).

Db2_dataCollect_archive_purge.sh – You should use a crontab or scheduler to execute this script about every 24 hours.
This script moves the diagnostic data from the staging folders (where the collection scripts above store their diagnostic data) into an archive folder. It then deletes any files older than a retention period (default 30 days) from the archive folder.
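A sketch of possible crontab entries for the three scripts (the installation path is a placeholder; adjust the intervals to your needs):

        # minute hour dom month dow  command
        */2      *    *   *     *    /home/db2inst1/pdc/Db2_dataCollect_minutely.sh
        0        *    *   *     *    /home/db2inst1/pdc/Db2_dataCollect_hourly.sh
        0        2    *   *     *    /home/db2inst1/pdc/Db2_dataCollect_archive_purge.sh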


In future blog posts, I hope to provide details on how database administrators can utilize the diagnostic data collected by these scripts.  Seasoned DBAs will be familiar with much of the information collected here and find it useful for self analysis.  Db2 support analysts will find it extremely useful in the triage of a multitude of problems.




Your feedback is always welcome. If you have any suggestions, corrections, or recommendations, or would like to monitor for updates to these scripts, please subscribe to the public community for Db2 Persistent Data Collection Scripts:
https://www.ibm.com/developerworks/community/groups/service/html/communitystart?communityUuid=a294bddb-9ded-4757-8d5e-ca5ba173e2cd
