ThinkingDb2!: Enabling Db2 Historical Monitoring (db2histmon) - continuous diagnostic data collection and archival

- David Sciaraffa, Software Engineering Manager – IBM Db2

When a problem with Db2 arises, identification of the problem's root cause by IBM Db2 support and development is often directly influenced by the availability of diagnostic information before, during, and after the time-frame of the problem event. When the diagnostic messages in the db2diag.log are insufficient, it is not uncommon for IBM Db2 support or development to provide instructions to enable a specific set of diagnostic data, and wait for a re-occurrence of the problem, in order to begin narrowing the root cause.

The new Db2 Historical Monitoring (db2histmon) scripts (available here) collect a broad range of Db2 and Operating System diagnostic data, and retain this data for a period of time, allowing for basic triage of many types of issues. Diagnostic information is collected at various intervals (ranging from 1min to 1hour, depending on the kind of data), and can be easily customized and expanded. The scripts will collect data only while the database is activated, which can help to reduce the generation of unnecessary redundant data.

By enabling the collection of Db2 Historical Monitoring (db2histmon), you can improve the odds that helpful diagnostic information is available after the first occurrence of a problem.

This blog provides an overview of how to enable and customize the new Db2 Historical Monitoring (db2histmon) scripts.

Downloading the Db2 Historical Monitoring (db2histmon) scripts:

The scripts are available and can be downloaded from the open-source Github repo: https://github.com/IBM/db2histmon

Pre-requisites:

The scripts can be deployed in any Unix or Windows environment where Db2 is supported.
Python2 or higher, as well as the python ibm_db module are required to execute the scripts.
The scripts can be deployed on any relatively modern version of Db2 (Version 10.1, 10.5, 11.1, 11.5).
Db2 DBADM authority is required on the database where the historical monitoring will be deployed.
A C++ compiler (for example g++ on Unix or Visual Studio on Windows) is required (in order for the scripts to use bldrtn to compile the external C++ UDFs which are used by the historical monitoring framework).

Overview of the Db2 Historical Monitoring framework:

The historical monitoring scripts will utilize a set of control tables, procedures and external UDFs to operate, with the DB2 Admin Task Scheduler executing the data collection tasks at the desired frequency (minimum 1min), and will generate diagnostic/monitoring output into the active data collection folder first, and then archived into the data archival folder every hour. Consider the following process flow diagram:

Consider the following high-level process flow diagram:

Setup Folder:

After downloading the scripts to your local database server system, you'll notice three folders, 1_setup, 2_loader, and 3_quickparser:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/
$ ls

1_setup/ 2_loader/ 3_quickparser/

For the purpose of this blog, we'll be focusing on the 1_setup/ folder (which is used to setup and deploy the historical monitoring framework and begin data collection). In a subsequent blog I will provide an overview of the 2_loader script (which is used to easily load the data collections into a database so it can be queried), and the 3_quickparser script (which is used to display the data collections to a terminal window in a columnar format for easier viewing).

Within the 1_setup/ folder, we find three files, and sub-folder named sql/

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ ls

README.md setup.py sql/ task_details.json

The README.md file contains a technical description of the setup scripts, input options, as well as a brief architectural overview of how the scripts work. The purpose of this blog is to expand on this information with some visual examples.
The setup.py file is the main setup script, described in detail below.
The task_details.json file, described in detail below, contains the definition of each data collection task, and is used by the setup.py script to deploy each task.

Within the 1_setup/sql/ folder, we find the definition of the control tables, procedures, and user-defined functions which the setup.py script will create in the specified database. The details of these files would mostly be useful to developers or administrators who wish to explore and expand on the framework. I do not explore them in detail in this blog.

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/sql
$ ls

external.C

external.exp

external.sql

proc_collect.sql

tab_dirs.sql

tab_tasks.sql

external.def

external.h

proc_archive.sql

tab_config.sql

tab_errs.sql

Overview of the task_details.json file:

The task_details.json file defines each data collection task, and is used by the setup.py script to deploy each data collection task into the Db2 Admin Task Scheduler.

The default task_details.json file that is included with the scripts looks like so:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ less task_details.json

[
{
"collection_name": "ENV_GET_SYSTEM_RESOURCES",
"collection_class": "SQL",
"collection_command": "SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE(SYSPROC.ENV_GET_SYSTEM_RESOURCES()) T",
"collection_freq": "* * * * *",
"collection_level": "1",
"collection_condition": "",
"loader_join_columns": "",
"loader_diff_exempt_columns": "ALL",
"quickparse_summary_columns": "ALL"
},

...etc...

...etc...

{
"collection_name": "VMSTAT",
"collection_class": "SYS",
"collection_command": "vmstat",
"collection_freq": "* * * * *",
"collection_level": "1",
"collection_condition": "UNIX",
"loader_join_columns": "",
"loader_diff_exempt_columns": "",
"quickparse_summary_columns": ""
}

The "collection_name" defines the name of the collection task (this name will be used later to generate the filename into which this data collection will be output, so avoid using spaces or special characters).
The "collection_class" can be either "SYS" or "SQL". If this task is defined as "SYS" then the collection_command is invoked as a system call to the operating system. If this task is defined as "SQL", then the collection_command is invoked as an sql query against the database.
The "collection_command" is the operating system or sql query which generates the desired monitoring or diagnostic information.
The "collection_freq" defines the frequency of data collection, and this is specified in cron format. The maximum frequency is every 1 minute.
The "collection_level" can be either "1" or "2". When the setup.py script is executed to deploy these data collection tasks to the database, the administrator has the option of deploying only a small set of basic data collection (which are those tasks defined with a collection_level value of 1), or an expanded set of data collection (which are those tasks defined with a collection_level value of 2).
The "collection_condition" is used to define the environmental condition under which this data collection task should run. As of the time of this writing, the possible values are "HADR", "PURESCALE", "UNIX", "WINDOWS", or nothing. For example, if a task is defined with a collection_condition value of "HADR", then the task is only executed if an db2 HADR environment is detected. If the collection_condition is nothing, then the task is always executed.
The "loader_join_columns", "loader_diff_exempt_columns", and "quickparse_summary_columns" will be the focus of a subsequent blog.

setup.py usage and arguments:

Use the setup.py --help option to display usage:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py --help

usage: setup.py [-h] [-un] [-pw] [-c] [-uc] [-ut] [-bp] [-cp] [-ap] [-ms]
[-acmd] [-aext] [-lvl]
database

Setup IBMHIST monitoring program. View README for more information.

positional arguments:
database name of database

optional arguments:
-h, --help show this help message and exit
-un , --username username used to connect to database, default is that
of current user
-pw , --password password used to connect to database, default is that
of current user
-c, --cleanup will drop the IBMHIST schema and all of its objects,
and unschedule its tasks
-uc, --update_config_only
will update config values to passed in arguments or
reset to defaults for those not explicitly specified
-ut, --update_tasks_only
will unschedule all old tasks and reschedule to match
with the task_details.json file
-bp , --bldrtn_path specify path for bldrtn script if it is not
automatically found, default:
/home/<current_user>/sqllib/samples/cpp/bldrtn
-cp , --coll_path directory path of data collection, default:
/home/<current user>/sqllib/db2dump
-ap , --arch_path directory path of data archival, default:
/home/<current user>/sqllib/db2dump
-ms , --max_size max size in bytes of collection and archival, default:
1073741824
-acmd , --arch_cmd command used to archive hourly folders with "_src_"
and "_dest_" placeholders, default: tar -caf _dest_
_src_
-aext , --arch_ext extension of archived hourly folders beginning with
".", default: .tar.gz
-lvl , --coll_lvl scope of tasks to schedule (1 will only schedule key
tasks, 2 will schedule more), all tasks scheduled if
unspecified

The [database] name parameter is mandatory and specifies the local database against which the historical monitoring setup will be deployed. If you have multiple databases in your database instance, and wish to deploy historical monitoring on other these databases, you must perform the setup.py script on each one separately.
The --username and --password arguments are used to establish the database connection. These are required if your database is configured to disallow local user IPC connections without a password. (If unspecified, a connection to the database using the current user is used).
The --cleanup option will perform a full removal of all previously created IBMHIST control tables, procedures, and external UDFs, and data collection tasks. Any previous data collection files or archived sub-folders will not be removed.
The --update_config_only option will perform only the creation (or re-creation) of the IBMHIST control tables, procedures, and external UDFs, and configure (or re-configure) any options/arguments that are specified. Note that default values are used for any options/arguments which are not specified, even if these arguments were specified during a previous invocation of the script.
The --update_tasks_only option will perform only the removal of any previously deployed data collection tasks, and will parse of the task_details.json file and (re)deploy all data collections tasks.
(Note, when neither the --update_config_only or --update_tasks_only options are specified, then both operations are always performed).
The --bldrtn_path optional argument specifies the fully qualified path to the sqllib/samples/cpp/bldrtn executable. This is required to compile the external C++ UDFs which perform the collection of operating system 'SYS' data collection tasks. (If unspecified, the default path /home/<current_user>/sqllib/samples/cpp/bldrtn is used).
The --coll_path optional argument specifies the fully qualified path where a data collection sub-folder will be created (every hour) and the monitoring/diagnostic data will be placed into. (If unspecified, the default path /home/<current_user>/sqllib/db2dump/ is used).
The --arch_path optional argument specifies the fully qualified path to the folder where the data collection folder (every hour) will be moved after it is removed from the active coll_path. (If unspecified, the default path /home/<current_user>/sqllib/db2dump/ is used).
The --max_size optional argument specifies the maximum size (in bytes) of archival data to retain within the arch_path. When this limit is breached, hourly data collection folders (starting with the oldest) will be deleted. (If unspecified, the default value 1073741824 bytes (1GB) is used).
The --arch_cmd optional argument specifies the operating system command that is used (every hour) to tar/zip/compress the active data collection folder into the archive folder. A filename (in the format '<dbname>_<yyyy><mm><dd><hh>') is automatically generated and should not be specified. Tokens '_src_' and '_dest_' are used as placeholders in place of the coll_path and arch_path respectively. (If unspecified, the default command 'tar -caf _dest_ _src_' is assumed). For example, if one wanted to use the star unix command, instead of the default tar command, one might specify 'star -c -f=_dest_ _src_'.
The --arch_ext optional argument specifies the extension to tack onto the archived folder's file after the arch_cmd is completed. (If unspecified, the default extension ".tar.gz" is used).
The --coll_level optional argument specifies the level of data collections. If "1" is specified, only the data collection tasks in the task_details.json file with a "collection_level" value of 1 are deployed. If "2" is specified, any data collection tasks with a "collection_level" value of 1 or 2 are deployed.

Example of running the setup.py script:

Here, we execute the setup.py script against database TESTDB1, using all default arguments:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1

First, we see the creation of the SYSTOOLSPACE table space (if not already existing), and the determination of environment conditions (UNIX/WINDOWS, HADR, and/or PURESCALE):

Connecting to database: TESTDB1
Creating tablespace SYSTOOLSPACE ...
Determining environment ...
Determining if Windows or Unix ...
Unix detected ...
Determining if pureScale ...
pureScale not detected ...
Determining if HADR ...
HADR not detected ...

Next we see the unscheduling of any previously scheduled tasks:

Unscheduling tasks ...

Next we see the creation (or recreation) of the IBMHIST schema, the creation of the IBMHIST control tables, procedures, and compilation of the external UDFs.

Setting up IBMHIST schema and its objects ...
Creating IBMHIST schema ...
Setting schema to IBMHIST ...
Registering external functions ...
Copying bldrtn script from '/home/db2inst1/sqllib/samples/cpp/bldrtn' to sql/ ...
Building external functions ...
Executing external.sql ...
Registering tables ...
Executing tab_config.sql ...
Executing tab_errs.sql ...
Executing tab_tasks.sql ...
Executing tab_dirs.sql ...
Registering procedures ...
Executing proc_collect.sql ...
Executing proc_archive.sql ...
Removing additional files created during setup ...

Next we see the configuration of settings based on setup.py input arguments (or default values as in this example):

Configuring IBMHIST settings ...
Deleting configurations from IBMHIST.TAB_CONFIG ...
Setting COLL_PATH to '/home/db2inst1/sqllib/db2dump' ...
Setting ARCH_PATH to '/home/db2inst1/sqllib/db2dump' ...
Setting MAX_SIZE to '1073741824' ...
Setting ARCH_CMD to 'tar -caf _dest_ _src_' and ARCH_EXT to '.tar.gz' ...
Setting TASK_DETAILS_PATH to '/home/hotellnx112/db2inst1/db2histmon-master/1_setup/task_details.json' ...

Next we see some validation tests and sanity checks:

Testing IBMHIST functionality ...
Testing IBMHIST.PROC_COLLECT with a dummy SQL task ...
Testing IBMHIST.PROC_COLLECT with a dummy SYS task ...
Testing IBMHIST.PROC_ARCHIVE to archive collection directories ...
Testing IBMHIST.PROC_ARCHIVE to delete archival directories once max size is reached ...

Next we see the parsing of the task_details.json file, and the registration of data collection tasks to the Db2 Admin Task Scheduler:

Scheduling tasks ...
Setting Admin Task Scheduler registry variable ...
Reading from task_details.json file ...
Scheduling task: ENV_GET_SYSTEM_RESOURCES
Scheduling task: MON_GET_AGENT
Scheduling task: MON_GET_CONNECTION
Scheduling task: MON_GET_ACTIVITY
Scheduling task: MON_GET_UNIT_OF_WORK
Scheduling task: MON_CURRENT_SQL
Scheduling task: MON_GET_APPL_LOCKWAIT
Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
Scheduling task: MON_GET_MEMORY_POOL
Scheduling task: MON_GET_MEMORY_SET
Scheduling task: MON_GET_TRANSACTION_LOG
Scheduling task: DB_GET_CFG
Scheduling task: DBMCFG
Scheduling task: ENV_INST_INFO
Scheduling task: ENV_GET_REG_VARIABLES
Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
Scheduling task: MON_GET_GROUP_BUFFERPOOL
Scheduling task: MON_GET_INDEX
Scheduling task: MON_GET_LATCH
Scheduling task: MON_GET_BUFFERPOOL
Scheduling task: MON_GET_PAGE_ACCESS_INFO
Scheduling task: MON_GET_PKG_CACHE_STMT
Scheduling task: MON_GET_REBALANCE_STATUS
Scheduling task: MON_GET_SERVERLIST
Scheduling task: MON_GET_TABLE
Scheduling task: MON_GET_TABLESPACE
Scheduling task: MON_GET_UTILITY
Scheduling task: MON_GET_WORKLOAD
Scheduling task: VMSTAT
Scheduling task: IOSTAT
Scheduling task: NETSTAT
Scheduling task: ARCHIVE

Collection will begin in approximately 5 minutes ...

Finally, the setup script is complete. Within about 5mins the Db2 Admin Task Scheduler should begin executing the data collection tasks, at their defined frequency.

Closing connection ...
Done

The Data Collection and Data Archive folders:

Approximately 5mins after executing the setup script, the Db2 Admin Task Scheduler should begin executing the data collection tasks, at their defined collection frequency.

Because I did not explicitly specify a --coll_path for the data collection, the default path within the database's DIAGPATH (usually ~/sqllib/db2dump/) is used, and a folder "IBMHIST_<dbname>" is created. Likewise for the --arch_path, a folder named "IBMHIST_<dbname>_archive" is created:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/
$ ls

db2ats_sps_bnd.log

db2diag.log
db2inst1.nfy
IBMHIST_TESTDB1/
IBMHIST_TESTDB1_archive/
stmmlog

Within the data collection folder, we can see that an hourly data collection sub-folder has been created:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/
$ ls

TESTDB1_2020062714/

And within this hourly data collection sub-folder, we see the output of each of the data collection tasks, at their defined frequencies. In this example, I listed the directory contents approximately 9mins after the Db2 Admin Task Scheduler began executing the tasks:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062714/
$ ls

ENV_GET_SYSTEM_RESOURCES_202006271421.del MON_GET_MEMORY_SET_202006271421.del
ENV_GET_SYSTEM_RESOURCES_202006271422.del MON_GET_MEMORY_SET_202006271422.del
ENV_GET_SYSTEM_RESOURCES_202006271423.del MON_GET_MEMORY_SET_202006271423.del
ENV_GET_SYSTEM_RESOURCES_202006271424.del MON_GET_MEMORY_SET_202006271424.del
ENV_GET_SYSTEM_RESOURCES_202006271425.del MON_GET_MEMORY_SET_202006271425.del
ENV_GET_SYSTEM_RESOURCES_202006271426.del MON_GET_MEMORY_SET_202006271426.del
ENV_GET_SYSTEM_RESOURCES_202006271427.del MON_GET_MEMORY_SET_202006271427.del
ENV_GET_SYSTEM_RESOURCES_202006271428.del MON_GET_MEMORY_SET_202006271428.del
ENV_GET_SYSTEM_RESOURCES_202006271429.del MON_GET_MEMORY_SET_202006271429.del
IOSTAT_202006271421.del MON_GET_PAGE_ACCESS_INFO_202006271421.del
IOSTAT_202006271422.del MON_GET_PAGE_ACCESS_INFO_202006271422.del
IOSTAT_202006271423.del MON_GET_PAGE_ACCESS_INFO_202006271423.del
IOSTAT_202006271424.del MON_GET_PAGE_ACCESS_INFO_202006271424.del
IOSTAT_202006271425.del MON_GET_PAGE_ACCESS_INFO_202006271425.del
IOSTAT_202006271426.del MON_GET_PAGE_ACCESS_INFO_202006271426.del
IOSTAT_202006271427.del MON_GET_PAGE_ACCESS_INFO_202006271427.del
IOSTAT_202006271428.del MON_GET_PAGE_ACCESS_INFO_202006271428.del
IOSTAT_202006271429.del MON_GET_PAGE_ACCESS_INFO_202006271429.del
MON_CURRENT_SQL_202006271421.del MON_GET_REBALANCE_STATUS_202006271421.del
MON_CURRENT_SQL_202006271424.del MON_GET_REBALANCE_STATUS_202006271422.del
MON_CURRENT_SQL_202006271427.del MON_GET_REBALANCE_STATUS_202006271423.del
MON_GET_ACTIVITY_202006271422.del MON_GET_REBALANCE_STATUS_202006271424.del
MON_GET_ACTIVITY_202006271424.del MON_GET_REBALANCE_STATUS_202006271425.del
MON_GET_ACTIVITY_202006271426.del MON_GET_REBALANCE_STATUS_202006271426.del
MON_GET_ACTIVITY_202006271428.del MON_GET_REBALANCE_STATUS_202006271427.del
MON_GET_AGENT_202006271422.del MON_GET_REBALANCE_STATUS_202006271428.del
MON_GET_AGENT_202006271424.del MON_GET_REBALANCE_STATUS_202006271429.del
MON_GET_AGENT_202006271426.del MON_GET_SERVERLIST_202006271421.del
MON_GET_AGENT_202006271428.del MON_GET_SERVERLIST_202006271422.del
MON_GET_APPL_LOCKWAIT_202006271421.del MON_GET_SERVERLIST_202006271423.del
MON_GET_APPL_LOCKWAIT_202006271422.del MON_GET_SERVERLIST_202006271424.del
MON_GET_APPL_LOCKWAIT_202006271423.del MON_GET_SERVERLIST_202006271425.del
MON_GET_APPL_LOCKWAIT_202006271424.del MON_GET_SERVERLIST_202006271426.del
MON_GET_APPL_LOCKWAIT_202006271425.del MON_GET_SERVERLIST_202006271427.del
MON_GET_APPL_LOCKWAIT_202006271426.del MON_GET_SERVERLIST_202006271428.del
MON_GET_APPL_LOCKWAIT_202006271427.del MON_GET_SERVERLIST_202006271429.del
MON_GET_APPL_LOCKWAIT_202006271428.del MON_GET_TRANSACTION_LOG_202006271421.del
MON_GET_APPL_LOCKWAIT_202006271429.del MON_GET_TRANSACTION_LOG_202006271422.del
MON_GET_CONNECTION_202006271421.del MON_GET_TRANSACTION_LOG_202006271423.del
MON_GET_CONNECTION_202006271424.del MON_GET_TRANSACTION_LOG_202006271424.del
MON_GET_CONNECTION_202006271427.del MON_GET_TRANSACTION_LOG_202006271425.del
MON_GET_EXTENDED_LATCH_WAIT_202006271421.del MON_GET_TRANSACTION_LOG_202006271426.del
MON_GET_EXTENDED_LATCH_WAIT_202006271422.del MON_GET_TRANSACTION_LOG_202006271427.del
MON_GET_EXTENDED_LATCH_WAIT_202006271423.del MON_GET_TRANSACTION_LOG_202006271428.del
MON_GET_EXTENDED_LATCH_WAIT_202006271424.del MON_GET_TRANSACTION_LOG_202006271429.del
MON_GET_EXTENDED_LATCH_WAIT_202006271425.del MON_GET_UNIT_OF_WORK_202006271422.del
MON_GET_EXTENDED_LATCH_WAIT_202006271426.del MON_GET_UNIT_OF_WORK_202006271424.del
MON_GET_EXTENDED_LATCH_WAIT_202006271427.del MON_GET_UNIT_OF_WORK_202006271426.del
MON_GET_EXTENDED_LATCH_WAIT_202006271428.del MON_GET_UNIT_OF_WORK_202006271428.del
MON_GET_EXTENDED_LATCH_WAIT_202006271429.del MON_GET_UTILITY_202006271421.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271421.del MON_GET_UTILITY_202006271422.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271422.del MON_GET_UTILITY_202006271423.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271423.del MON_GET_UTILITY_202006271424.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271424.del MON_GET_UTILITY_202006271425.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271425.del MON_GET_UTILITY_202006271426.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271426.del MON_GET_UTILITY_202006271427.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271427.del MON_GET_UTILITY_202006271428.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271428.del MON_GET_UTILITY_202006271429.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271429.del MON_GET_WORKLOAD_202006271425.del
MON_GET_INDEX_202006271425.del NETSTAT_202006271421.del
MON_GET_LATCH_202006271421.del NETSTAT_202006271422.del
MON_GET_LATCH_202006271422.del NETSTAT_202006271423.del
MON_GET_LATCH_202006271423.del NETSTAT_202006271424.del
MON_GET_LATCH_202006271424.del NETSTAT_202006271425.del
MON_GET_LATCH_202006271425.del NETSTAT_202006271426.del
MON_GET_LATCH_202006271426.del NETSTAT_202006271427.del
MON_GET_LATCH_202006271427.del NETSTAT_202006271428.del
MON_GET_LATCH_202006271428.del NETSTAT_202006271429.del
MON_GET_LATCH_202006271429.del task_details_copy.json
MON_GET_MEMORY_POOL_202006271421.del VMSTAT_202006271421.del
MON_GET_MEMORY_POOL_202006271422.del VMSTAT_202006271422.del
MON_GET_MEMORY_POOL_202006271423.del VMSTAT_202006271423.del
MON_GET_MEMORY_POOL_202006271424.del VMSTAT_202006271424.del
MON_GET_MEMORY_POOL_202006271425.del VMSTAT_202006271425.del
MON_GET_MEMORY_POOL_202006271426.del VMSTAT_202006271426.del
MON_GET_MEMORY_POOL_202006271427.del VMSTAT_202006271427.del
MON_GET_MEMORY_POOL_202006271428.del VMSTAT_202006271428.del
MON_GET_MEMORY_POOL_202006271429.del VMSTAT_202006271429.del

After a few hours, I can see that each hours data collection folders has been tar+gzipped and moved to the archive folder:

(db2inst1@hotellnx112) /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive
$ ls

TESTDB1_2020062716.tar.gz

TESTDB1_2020062717.tar.gz

TESTDB1_2020062718.tar.gz
TESTDB1_2020062719.tar.gz

The IBMHIST Control Tables

As mentioned earlier, the historical monitoring framework uses a set of control tables in order to operate. We can see the tables here:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ db2 connect to testdb1
$ db2 list tables for schema IBMHIST

Table/View Schema Type Creation time
------------------------------- --------------- ----- --------------------------
TAB_CONFIG IBMHIST T 2020-06-27-14.12.07.943550
TAB_DIRS IBMHIST T 2020-06-27-14.12.08.907709
TAB_ERRS IBMHIST T 2020-06-27-14.12.08.267541
TAB_TASKS IBMHIST T 2020-06-27-14.12.08.521290

The IBMHIST.TAB_CONFIG table contains information about the configuration parameters/arguments that were passed into the setup.py script (or the default value, if they were not specified).

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select CONFIG_NAME, VALUE, DESCRIPTION from IBMHIST.TAB_CONFIG"

CONFIG_NAME VALUE DESCRIPTION
-------------------- ---------------------------------------- ----------------------------------------
COLL_PATH /home/db2inst1/sqllib/db2dump DIRECTORY PATH OF DATA COLLECTION
ARCH_PATH /home/db2inst1/sqllib/db2dump DIRECTORY PATH OF DATA ARCHIVAL
MAX_SIZE 1073741824 MAX SIZE OF COLLECTION AND ARCHIVAL
ARCH_CMD tar -caf _dest_ _src_ COMMAND USED TO ARCHIVE HOURLY DIRECTORI
ARCH_EXT .tar.gz EXTENSION OF ARCHIVE HOURLY DIRECTORIES
TASK_DETAILS_PATH /home/hotellnx112/db2inst1/db2histmon-ma LOCATION OF task_details.json FILE

6 record(s) selected.

The IBMHIST.TAB_DIRS table contains information about the current (and any previously specified) data collection paths (coll_path), and a history of each data archive path (arch_path), including the size of the archived file.

For example, we can see the current hourly data collection path as so:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select PATH, SIZE, STATUS, TIME from IBMHIST.TAB_DIRS where STATUS='COLL'"

PATH SIZE STATUS TIME
---------------------------------------------------------------------- -------------------- ------ --------------------------
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062910 5133004 COLL 2020-06-29-10.43.00.000000

1 record(s) selected.

Or we can see a list of all the previously archived hourly archived folders as so:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select PATH, SIZE, STATUS, TIME from IBMHIST.TAB_DIRS where STATUS='ARCH' order by TIME asc"

PATH SIZE STATUS TIME
---------------------------------------------------------------------- -------------------- ------ --------------------------
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627 328432 ARCH 2020-06-27-16.59.00.000000
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627 1901572 ARCH 2020-06-27-17.59.00.000000
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627 1923296 ARCH 2020-06-27-18.59.00.000000
...etc...

15 record(s) selected.

The IBMHIST.TAB_TASKS table contains information about each data collection task, as was defined in the task_detail.json file when the full setup.py was last performed (or refreshed).

For example, we can list each active data collection task including it's operating system command or sql query:

(db2inst1@hotellnx112) /home/db2inst1/sqllib/
$ db2 "select COLL_NAME, CLASS, COMMAND from IBMHIST.TAB_TASKS"

COLL_NAME CLASS COMMAND
-------------------- ----- ----------------------------------------------------------------------
ENV_GET_SYSTEM_RESOU SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( SYSPROC.
MON_GET_AGENT SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_CONNECTION SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_ACTIVITY SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_UNIT_OF_WORK SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_CURRENT_SQL SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.MON_CU
MON_GET_APPL_LOCKWAI SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE( MON_GET_A
MON_GET_EXTENDED_LAT SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_MEMORY_POOL SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE (MON_GET_M
MON_GET_MEMORY_SET SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TRANSACTION_ SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
DB_GET_CFG SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( DB_GET_C
DBMCFG SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.DBMCFG
ENV_INST_INFO SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.ENV_IN
ENV_GET_REG_VARIABLE SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( ENV_GET_
MON_GET_EXTENT_MOVEM SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_GROUP_BUFFER SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_INDEX SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_LATCH SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_BUFFERPOOL SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_PAGE_ACCESS_ SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_PKG_CACHE_ST SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_REBALANCE_ST SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_SERVERLIST SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TABLE SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TABLESPACE SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_UTILITY SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_WORKLOAD SQL SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
VMSTAT SYS vmstat
IOSTAT SYS iostat
NETSTAT SYS netstat

For SQL class tasks, we can also list all the column names of the generated data by examining the HEADER column:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select COLL_NAME, HEADER as COLUMN_NAMES from IBMHIST.TAB_TASKS where CLASS='SQL'"

COLL_NAME COLUMN_NAMES
-------------------- ----------------------------------------------------------------------
ENV_GET_SYSTEM_RESOU COLLECTION_TIME,MEMBER,OS_NAME,HOST_NAME,OS_VERSION,OS_RELEASE,MACHINE
MON_GET_AGENT COLLECTION_TIME,SERVICE_SUPERCLASS_NAME,SERVICE_SUBCLASS_NAME,APPLICAT
MON_GET_CONNECTION COLLECTION_TIME,APPLICATION_HANDLE,APPLICATION_NAME,APPLICATION_ID,MEM
MON_GET_ACTIVITY COLLECTION_TIME,APPLICATION_HANDLE,DBPARTITIONNUM,COORD_PARTITION_NUM,
MON_GET_UNIT_OF_WORK COLLECTION_TIME,SERVICE_SUPERCLASS_NAME,SERVICE_SUBCLASS_NAME,SERVICE_
MON_CURRENT_SQL COLLECTION_TIME,COORD_MEMBER,APPLICATION_HANDLE,APPLICATION_NAME,SESSI
MON_GET_APPL_LOCKWAI COLLECTION_TIME,LOCK_WAIT_START_TIME,LOCK_NAME,LOCK_OBJECT_TYPE_ID,LOC
MON_GET_EXTENDED_LAT COLLECTION_TIME,MEMBER,LATCH_NAME,TOTAL_EXTENDED_LATCH_WAITS,TOTAL_EXT
MON_GET_MEMORY_POOL COLLECTION_TIME,MEMBER,HOST_NAME,DB_NAME,MEMORY_SET_TYPE,MEMORY_POOL_T
MON_GET_MEMORY_SET COLLECTION_TIME,MEMBER,HOST_NAME,DB_NAME,MEMORY_SET_TYPE,MEMORY_SET_ID
MON_GET_TRANSACTION_ COLLECTION_TIME,MEMBER,TOTAL_LOG_AVAILABLE,TOTAL_LOG_USED,SEC_LOG_USED
DB_GET_CFG COLLECTION_TIME,NAME,VALUE,VALUE_FLAGS,DEFERRED_VALUE,DEFERRED_VALUE_F
DBMCFG COLLECTION_TIME,NAME,VALUE,VALUE_FLAGS,DEFERRED_VALUE,DEFERRED_VALUE_F
ENV_INST_INFO COLLECTION_TIME,INST_NAME,IS_INST_PARTITIONABLE,NUM_DBPARTITIONS,INST_
ENV_GET_REG_VARIABLE COLLECTION_TIME,MEMBER,REG_VAR_NAME,REG_VAR_VALUE,REG_VAR_ON_DISK_VALU
MON_GET_EXTENT_MOVEM COLLECTION_TIME,TBSP_NAME,TBSP_ID,MEMBER,CURRENT_EXTENT,LAST_EXTENT,NU
MON_GET_GROUP_BUFFER COLLECTION_TIME,MEMBER,NUM_GBP_FULL
MON_GET_INDEX COLLECTION_TIME,TABSCHEMA,TABNAME,IID,MEMBER,DATA_PARTITION_ID,NLEAF,N
MON_GET_LATCH COLLECTION_TIME,LATCH_NAME,MEMORY_ADDRESS,EDU_ID,EDU_NAME,APPLICATION_
MON_GET_BUFFERPOOL COLLECTION_TIME,BP_NAME,MEMBER,AUTOMATIC,DIRECT_READS,DIRECT_READ_REQS
MON_GET_PAGE_ACCESS_ COLLECTION_TIME,TABSCHEMA,TABNAME,OBJTYPE,MEMBER,PAGE_RECLAIMS_X,PAGE_
MON_GET_PKG_CACHE_ST COLLECTION_TIME,MEMBER,SECTION_TYPE,INSERT_TIMESTAMP,EXECUTABLE_ID,PAC
MON_GET_REBALANCE_ST COLLECTION_TIME,TBSP_NAME,TBSP_ID,DBPARTITIONNUM,MEMBER,REBALANCER_MOD
MON_GET_SERVERLIST COLLECTION_TIME,MEMBER,CACHED_TIMESTAMP,HOSTNAME,PORT_NUMBER,SSL_PORT_
MON_GET_TABLE COLLECTION_TIME,TABSCHEMA,TABNAME,MEMBER,TAB_TYPE,TAB_FILE_ID,DATA_PAR
MON_GET_TABLESPACE COLLECTION_TIME,TBSP_NAME,TBSP_ID,MEMBER,TBSP_TYPE,TBSP_CONTENT_TYPE,T
MON_GET_UTILITY COLLECTION_TIME,MEMBER,COORD_MEMBER,APPLICATION_HANDLE,APPLICATION_ID,
MON_GET_WORKLOAD COLLECTION_TIME,WORKLOAD_NAME,WORKLOAD_ID,MEMBER,ACT_ABORTED_TOTAL,ACT

28 record(s) selected.

The IBMHIST.TAB_ERRS will contain a history of any execution errors that occurred by the historical monitoring framework:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select TIME, CALLER, COLL_NAME, ERRMSG from IBMHIST.TAB_ERRS"

TIME CALLER COLL_NAME 4
-------------------------- -------------------- -------------------- ------------------------------------------------------------
2020-06-27-14.51.29.200058 PROC_COLLECT DUMMY_SQL_TASK Failed to write to data collection path: /home/db2inst1/MYHI...etc...

1 record(s) selected.

Example of modifying a setup parameter (changing the data collection path):

First we create a new folder to host the active data collection:

(db2inst1@hotellnx112) /home/db2inst1/
$ mkdir /home/db2inst1/MYHISTMON_COLLECTION_TESTDB1

Since the data collections are performed by external UDFs under a fenced userid (not the db2-instance-owner userid), our new folder must be accessible to the fenced userid, for simplicity I'm giving the folder permission to everyone:

(db2inst1@hotellnx112) /home/db2inst1/
$ chmod 777 MYHISTMON_COLLECTION_TESTDB1

Next I execute the setup script, specifying the --coll_path option with our new path. I also specify the --update_config_only option, since I did not modify the task_details.json file and do not need to update all the data collection tasks.

(db2inst1@hotellnx112) /home/dbb2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --coll_path /home/db2inst1MYHISTMON_COLLECTION_TESTDB1/ --update_config_only

Connecting to database: TESTDB1
Configuring IBMHIST settings ...
Deleting configurations from IBMHIST.TAB_CONFIG ...
Setting COLL_PATH to '/home/db2inst1/MYHISTMON_COLLECTION_TESTDB1' ...
Setting ARCH_PATH to '/home/db2inst1/sqllib/db2dump' ...
Setting MAX_SIZE to '1073741824' ...
Setting ARCH_CMD to 'tar -caf _dest_ _src_' and ARCH_EXT to '.tar.gz' ...
Setting TASK_DETAILS_PATH to '/home/hotellnx112/db2inst1/db2histmon-master/1_setup/task_details.json' ...
Closing connection ...
Done

All now all subsequent data collection tasks will be put into the new data collection path.

(Note that existing data in the old collection path will be automatically moved into the new path).

(db2inst1@hotellnx112) /home/db2inst1/MYHISTMON_COLLECTION_TESTDB1/IBMHIST_TESTDB1/TESTDB1_2020062913
$ ls

IOSTAT_202006291304.del
IOSTAT_202006291305.del
IOSTAT_202006291306.del
IOSTAT_202006291307.del
IOSTAT_202006291308.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291304.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291305.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291306.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291307.del
...etc...
...etc...

Example of Adding a new data collection task (a new operating system metric):

In this example, I want to add a new task to collect "ps -elf" output from the operating system, every minute.

First, we edit our task_details.json file to add a new entry to the bottom for our new task called 'PS_ELF' as so:
(note, I can probably just leave the collection_condition field blank, since I do not intend to run this task_details.json file in a Windows environment, and thus always executing this data collectiont ask would be fine, but for completeness I'll add UNIX as a collection_condition, since 'ps -elf' makes no sense in Windows).

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

...etc...
...etc...
,
{
"collection_name": "PS_ELF",
"collection_class": "SYS",
"collection_command": "ps -elf",
"collection_freq": "* * * * *",
"collection_level": "1",
"collection_condition": "UNIX",
"loader_join_columns": "",
"loader_diff_exempt_columns": "",
"quickparse_summary_columns": ""
}

Next, I executed the setup.py script. Since I'm only adding a new task, I'll just use the --update_tasks_only option. We can see our new PS_ELF task included:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --update_tasks_only

Connecting to database: TESTDB1
Determining environment ...
Determining if Windows or Unix ...
Unix detected ...
Determining if pureScale ...
pureScale not detected ...
Determining if HADR ...
HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
Setting Admin Task Scheduler registry variable ...
Reading from task_details.json file ...
Scheduling task: ENV_GET_SYSTEM_RESOURCES
Scheduling task: MON_GET_AGENT
Scheduling task: MON_GET_CONNECTION
Scheduling task: MON_GET_ACTIVITY
Scheduling task: MON_GET_UNIT_OF_WORK
Scheduling task: MON_CURRENT_SQL
Scheduling task: MON_GET_APPL_LOCKWAIT
Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
Scheduling task: MON_GET_MEMORY_POOL
Scheduling task: MON_GET_MEMORY_SET
Scheduling task: MON_GET_TRANSACTION_LOG
Scheduling task: DB_GET_CFG
Scheduling task: DBMCFG
Scheduling task: ENV_INST_INFO
Scheduling task: ENV_GET_REG_VARIABLES
Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
Scheduling task: MON_GET_GROUP_BUFFERPOOL
Scheduling task: MON_GET_INDEX
Scheduling task: MON_GET_LATCH
Scheduling task: MON_GET_BUFFERPOOL
Scheduling task: MON_GET_PAGE_ACCESS_INFO
Scheduling task: MON_GET_PKG_CACHE_STMT
Scheduling task: MON_GET_REBALANCE_STATUS
Scheduling task: MON_GET_SERVERLIST
Scheduling task: MON_GET_TABLE
Scheduling task: MON_GET_TABLESPACE
Scheduling task: MON_GET_UTILITY
Scheduling task: MON_GET_WORKLOAD
Scheduling task: VMSTAT
Scheduling task: IOSTAT
Scheduling task: NETSTAT
Scheduling task: PS_ELF
Scheduling task: ARCHIVE

Collection will begin in approximately 5 minutes ...

Closing connection ...
Done

If we examine the IBMHIST.TAB_TASKS table, we can also see the new PS_ELF data collection task:

db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ db2 "select COLL_NAME from IBMHIST.TAB_TASKS where COLL_NAME='PS_ELF'"

COLL_NAME
----------------------------------------------------------------
PS_ELF

1 record(s) selected.

If we examine the SYSTOOLS.ADMIN_TASK_LIST table, we can see that our PS_ELF task has been added to the admin task scheduler:

db2inst1@hotellnx112> /home/db2inst1/
$ db2 "SELECT NAME, PROCEDURE_NAME, REMARKS from SYSTOOLS.ADMIN_TASK_LIST"

NAME PROCEDURE_NAME REMARKS
------------------------------ -------------------- ------------------------------
MON_GET_LATCH PROC_COLLECT IBMHIST Collection
MON_GET_BUFFERPOOL PROC_COLLECT IBMHIST Collection
MON_GET_PAGE_ACCESS_INFO PROC_COLLECT IBMHIST Collection
MON_GET_PKG_CACHE_STMT PROC_COLLECT IBMHIST Collection
MON_GET_REBALANCE_STATUS PROC_COLLECT IBMHIST Collection
MON_GET_SERVERLIST PROC_COLLECT IBMHIST Collection
MON_GET_TABLE PROC_COLLECT IBMHIST Collection
MON_GET_TABLESPACE PROC_COLLECT IBMHIST Collection
MON_GET_UTILITY PROC_COLLECT IBMHIST Collection
MON_GET_WORKLOAD PROC_COLLECT IBMHIST Collection
VMSTAT PROC_COLLECT IBMHIST Collection
IOSTAT PROC_COLLECT IBMHIST Collection
NETSTAT PROC_COLLECT IBMHIST Collection
ARCHIVE PROC_ARCHIVE IBMHIST Archival
PS_ELF PROC_COLLECT IBMHIST Collection
MON_GET_FCM PROC_COLLECT IBMHIST Collection
MON_GET_AGENT PROC_COLLECT IBMHIST Collection
MON_GET_CONNECTION PROC_COLLECT IBMHIST Collection
MON_GET_ACTIVITY PROC_COLLECT IBMHIST Collection
MON_GET_UNIT_OF_WORK PROC_COLLECT IBMHIST Collection
MON_CURRENT_SQL PROC_COLLECT IBMHIST Collection
MON_GET_APPL_LOCKWAIT PROC_COLLECT IBMHIST Collection
MON_GET_EXTENDED_LATCH_WAIT PROC_COLLECT IBMHIST Collection
MON_GET_MEMORY_POOL PROC_COLLECT IBMHIST Collection
MON_GET_MEMORY_SET PROC_COLLECT IBMHIST Collection
MON_GET_TRANSACTION_LOG PROC_COLLECT IBMHIST Collection
DB_GET_CFG PROC_COLLECT IBMHIST Collection
DBMCFG PROC_COLLECT IBMHIST Collection
ENV_INST_INFO PROC_COLLECT IBMHIST Collection
ENV_GET_REG_VARIABLES PROC_COLLECT IBMHIST Collection
MON_GET_EXTENT_MOVEMENT_STATUS PROC_COLLECT IBMHIST Collection
MON_GET_GROUP_BUFFERPOOL PROC_COLLECT IBMHIST Collection
MON_GET_INDEX PROC_COLLECT IBMHIST Collection
ENV_GET_SYSTEM_RESOURCES PROC_COLLECT IBMHIST Collection

34 record(s) selected.

Lastly, if we wait approximately 5mins for the Db2 Admin Task Scheduler to process the new task, we can then begin seeing the output of our new PS_ELF task within the hourly data collection folder:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062717/
$ ls -l

DB_GET_CFG_202006271700.del
DBMCFG_202006271700.del
...etc...

...etc...

PS_ELF_202006271700.del

...etc...

Example of Adding a new data collection task (a new sql query):

In this example, I want to add a new task to collect MON_GET_FCM() output every 10mins.

First, I'll edit the task_details.json file to add a new entry to the bottom for our new task called 'MON_GET_FCM', as so:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

...etc...
...etc...
,
{
"collection_name": "MON_GET_FCM",
"collection_class": "SQL",
"collection_command": "SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE (MON_GET_FCM(-2)) T",
"collection_freq": "10 * * * *",
"collection_level": "1",
"collection_condition": "",
"loader_join_columns": "",
"loader_diff_exempt_columns": "",
"quickparse_summary_columns": ""
}

Next, I'll invoke the setup.py script. Here I use the --update_tasks_only option. We can see the new MON_GET_FCM task added:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup.
$ python3 setup.py testdb1 --update_tasks_only

Connecting to database: TESTDB1
Determining environment ...
Determining if Windows or Unix ...
Unix detected ...
Determining if pureScale ...
pureScale not detected ...
Determining if HADR ...
HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
Setting Admin Task Scheduler registry variable ...
Reading from task_details.json file ...
Scheduling task: ENV_GET_SYSTEM_RESOURCES
Scheduling task: MON_GET_AGENT
Scheduling task: MON_GET_CONNECTION
Scheduling task: MON_GET_ACTIVITY
Scheduling task: MON_GET_UNIT_OF_WORK
Scheduling task: MON_CURRENT_SQL
Scheduling task: MON_GET_APPL_LOCKWAIT
Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
Scheduling task: MON_GET_MEMORY_POOL
Scheduling task: MON_GET_MEMORY_SET
Scheduling task: MON_GET_TRANSACTION_LOG
Scheduling task: DB_GET_CFG
Scheduling task: DBMCFG
Scheduling task: ENV_INST_INFO
Scheduling task: ENV_GET_REG_VARIABLES
Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
Scheduling task: MON_GET_GROUP_BUFFERPOOL
Scheduling task: MON_GET_INDEX
Scheduling task: MON_GET_LATCH
Scheduling task: MON_GET_BUFFERPOOL
Scheduling task: MON_GET_PAGE_ACCESS_INFO
Scheduling task: MON_GET_PKG_CACHE_STMT
Scheduling task: MON_GET_REBALANCE_STATUS
Scheduling task: MON_GET_SERVERLIST
Scheduling task: MON_GET_TABLE
Scheduling task: MON_GET_TABLESPACE
Scheduling task: MON_GET_UTILITY
Scheduling task: MON_GET_WORKLOAD
Scheduling task: VMSTAT
Scheduling task: IOSTAT
Scheduling task: NETSTAT
Scheduling task: PS_ELF
Scheduling task: MON_GET_FCM
Scheduling task: ARCHIVE

Collection will begin in approximately 5 minutes ...

Closing connection ...
Done

We can see our new task ash been added to the IBMHIST.TAB_TASKS table:

<db2inst1@hotellnx112> /home/db2inst1/
$ db2 "select COLL_NAME from IBMHIST.TAB_TASKS where COLL_NAME='MON_GET_FCM'"

COLL_NAME
----------------------------------------------------------------
MON_GET_FCM

1 record(s) selected.

Lastly, if we wait approximately 5mins for the Db2 Admin Task Scheduler to process the new task, we can then begin seeing the output of our new MON_GET_FCM task within the hourly data collection folder:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062914/
$ ls -l

DB_GET_CFG_202006291400.del
DBMCFG_202006291400.del
...etc...
...etc...
MON_GET_FCM_202006291439.del
...etc...
...etc...

Example of Removing an data collection task:

The cleanest way to remove a data collection task is to edit the task_details.json file, remove the desired task entry, and then run the setup.py script again, using the --update_tasks_only option.

For example, if I wanted to remove the iostat data collection task, I would remove this entry from the task_details.json file:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

...etc...
...etc...
{
~~"collection_name": "IOSTAT",~~
~~"collection_class": "SYS",~~
~~"collection_command": "iostat",~~
~~"collection_freq": "* * * * *",~~
~~"collection_level": "1",~~
~~"collection_condition": "UNIX",~~
~~"loader_join_columns": "",~~
~~"loader_diff_exempt_columns": "",~~
~~"quickparse_summary_columns": ""~~
},
...etc...
...etc...

And the re-run the setup.py script using the --update_tasks_only option:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/

$ python3 setup.py testdb1 --update_tasks_only

Connecting to database: TESTDB1
Determining environment ...
Determining if Windows or Unix ...
Unix detected ...
Determining if pureScale ...
pureScale not detected ...
Determining if HADR ...
HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
Setting Admin Task Scheduler registry variable ...
Reading from task_details.json file ...
Scheduling task: ENV_GET_SYSTEM_RESOURCES
Scheduling task: MON_GET_AGENT
Scheduling task: MON_GET_CONNECTION
Scheduling task: MON_GET_ACTIVITY
Scheduling task: MON_GET_UNIT_OF_WORK
Scheduling task: MON_CURRENT_SQL
Scheduling task: MON_GET_APPL_LOCKWAIT
Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
Scheduling task: MON_GET_MEMORY_POOL
Scheduling task: MON_GET_MEMORY_SET
Scheduling task: MON_GET_TRANSACTION_LOG
Scheduling task: DB_GET_CFG
Scheduling task: DBMCFG
Scheduling task: ENV_INST_INFO
Scheduling task: ENV_GET_REG_VARIABLES
Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
Scheduling task: MON_GET_GROUP_BUFFERPOOL
Scheduling task: MON_GET_INDEX
Scheduling task: MON_GET_LATCH
Scheduling task: MON_GET_BUFFERPOOL
Scheduling task: MON_GET_PAGE_ACCESS_INFO
Scheduling task: MON_GET_PKG_CACHE_STMT
Scheduling task: MON_GET_REBALANCE_STATUS
Scheduling task: MON_GET_SERVERLIST
Scheduling task: MON_GET_TABLE
Scheduling task: MON_GET_TABLESPACE
Scheduling task: MON_GET_UTILITY
Scheduling task: MON_GET_WORKLOAD
Scheduling task: VMSTAT
Scheduling task: NETSTAT
Scheduling task: PS_ELF
Scheduling task: MON_GET_FCM
Scheduling task: ARCHIVE

Collection will begin in approximately 5 minutes ...

Closing connection ...
Done

Example of Stopping all data collection tasks:

The easiest way to stop all data collection tasks is to simply disable and remove the historical monitoring framework from the database, using the setup.py --cleanup argument:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --cleanup

Connecting to database: TESTDB1
Unscheduling tasks ...
Dropping IBMHIST schema and its objects ...
Closing connection ...
Done

You can validate that no data collection tasks are scheduled within the Admin Task Scheduler by:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup.
$ db2 "SELECT NAME, PROCEDURE_NAME from SYSTOOLS.ADMIN_TASK_LIST"

NAME PROCEDURE_NAME
----------------------- --------------------------------------------------------

0 record(s) selected.

Stay tuned for new blogs expanding on the Db2 Historical Monitoring (db2histmon) suite, coming soon.

ThinkingDb2!

Tuesday, 30 June 2020

Enabling Db2 Historical Monitoring (db2histmon) - continuous diagnostic data collection and archival

Downloading the Db2 Historical Monitoring (db2histmon) scripts:

Pre-requisites:

Overview of the Db2 Historical Monitoring framework:

Setup Folder:

Overview of the task_details.json file:

setup.py usage and arguments:

Example of running the setup.py script:

The Data Collection and Data Archive folders:

Example of modifying a setup parameter (changing the data collection path):

Example of Adding a new data collection task (a new operating system metric):

Example of Adding a new data collection task (a new sql query):

Example of Removing an data collection task:

Example of Stopping all data collection tasks:

No comments:

Post a Comment

Translate