Tuesday 30 June 2020

Enabling Db2 Historical Monitoring (db2histmon) - continuous diagnostic data collection and archival

- David Sciaraffa, Software Engineering Manager – IBM Db2


When a problem with Db2 arises, identification of the problem's root cause by IBM Db2 support and development is often directly influenced by the availability of diagnostic information before, during, and after the time-frame of the problem event. When the diagnostic messages in the db2diag.log are insufficient, it is not uncommon for IBM Db2 support or development to provide instructions to enable a specific set of diagnostic data, and wait for a re-occurrence of the problem, in order to begin narrowing the root cause. 

The new Db2 Historical Monitoring (db2histmon) scripts (available here) collect a broad range of Db2 and Operating System diagnostic data, and retain this data for a period of time, allowing for basic triage of many types of issues. Diagnostic information is collected at various intervals (ranging from 1min to 1hour, depending on the kind of data), and can be easily customized and expanded. The scripts will collect data only while the database is activated, which can help to reduce the generation of unnecessary redundant data.

By enabling the collection of Db2 Historical Monitoring (db2histmon), you can improve the odds that helpful diagnostic information is available after the first occurrence of a problem.

This blog provides an overview of how to enable and customize the new Db2 Historical Monitoring (db2histmon) scripts.



Downloading the Db2 Historical Monitoring (db2histmon) scripts:

The scripts are available and can be downloaded from the open-source Github repo: https://github.com/IBM/db2histmon


Pre-requisites:

  • The scripts can be deployed in any Unix or Windows environment where Db2 is supported.
  • Python2 or higher, as well as the python ibm_db module are required to execute the scripts.
  • The scripts can be deployed on any relatively modern version of Db2 (Version 10.1, 10.5, 11.1, 11.5). 
  • Db2 DBADM authority is required on the database where the historical monitoring will be deployed.
  • A C++ compiler (for example g++ on Unix or Visual Studio on Windows) is required (in order for the scripts to use bldrtn to compile the external C++ UDFs which are used by the historical monitoring framework).

Overview of the Db2 Historical Monitoring framework:

The historical monitoring scripts will utilize a set of control tables, procedures and external UDFs to operate, with the DB2 Admin Task Scheduler executing the data collection tasks at the desired frequency (minimum 1min), and will generate diagnostic/monitoring output into the active data collection folder first, and then archived into the data archival folder every hour. Consider the following process flow diagram:

Consider the following high-level process flow diagram:



Setup Folder:

After downloading the scripts to your local database server system, you'll notice three folders, 1_setup, 2_loader, and 3_quickparser:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/
$ ls
 
1_setup/  2_loader/   3_quickparser/


For the purpose of this blog, we'll be focusing on the 1_setup/ folder (which is used to setup and deploy the historical monitoring framework and begin data collection). In a subsequent blog I will provide an overview of the 2_loader script (which is used to easily load the data collections into a database so it can be queried), and the 3_quickparser script (which is used to display the data collections to a terminal window in a columnar format for easier viewing).

Within the 1_setup/ folder, we find three files, and sub-folder named sql/

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ ls
 
README.md  setup.py  sql/  task_details.json

  • The README.md file contains a technical description of the setup scripts, input options, as well as a brief architectural overview of how the scripts work. The purpose of this blog is to expand on this information with some visual examples.

  • The setup.py file is the main setup script, described in detail below.

  • The task_details.json file, described in detail below, contains the definition of each data collection task, and is used by the setup.py script to deploy each task.

Within the 1_setup/sql/ folder, we find the definition of the control tables, procedures, and user-defined functions which the setup.py script will create in the specified database. The details of these files would mostly be useful to developers or administrators who wish to explore and expand on the framework. I do not explore them in detail in this blog.

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/sql
$ ls
 
external.C
external.exp 
external.sql 
proc_collect.sql
tab_dirs.sql
tab_tasks.sql 
external.def 
external.h 
proc_archive.sql 
tab_config.sql 
tab_errs.sql


Overview of the task_details.json file:

The task_details.json file defines each data collection task, and is used by the setup.py script to deploy each data collection task into the Db2 Admin Task Scheduler.

The default task_details.json file that is included with the scripts looks like so:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ less task_details.json

[
    {
      "collection_name": "ENV_GET_SYSTEM_RESOURCES",
      "collection_class": "SQL",
      "collection_command": "SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE(SYSPROC.ENV_GET_SYSTEM_RESOURCES()) T",
      "collection_freq": "* * * * *",
      "collection_level": "1",
      "collection_condition": "",
      "loader_join_columns": "",
      "loader_diff_exempt_columns": "ALL",
      "quickparse_summary_columns": "ALL"
    },

    ...etc...
        ...etc...

    {
      "collection_name": "VMSTAT",
      "collection_class": "SYS",
      "collection_command": "vmstat",
      "collection_freq": "* * * * *",
      "collection_level": "1",
      "collection_condition": "UNIX",
      "loader_join_columns": "",
      "loader_diff_exempt_columns": "",
      "quickparse_summary_columns": ""
    }


  • The "collection_name" defines the name of the collection task (this name will be used later to generate the filename into which this data collection will be output, so avoid using spaces or special characters).
  • The "collection_class" can be either "SYS" or "SQL".  If this task is defined as "SYS" then the collection_command is invoked as a system call to the operating system. If this task is defined as "SQL", then the collection_command is invoked as an sql query against the database.
  • The "collection_command" is the operating system or sql query which generates the desired monitoring or diagnostic information.
  • The "collection_freq" defines the frequency of data collection, and this is specified in cron format. The maximum frequency is every 1 minute. 
  • The "collection_level" can be either "1" or "2".  When the setup.py script is executed to deploy these data collection tasks to the database, the administrator has the option of deploying only a small set of basic data collection (which are those tasks defined with a collection_level value of 1), or an expanded set of data collection (which are those tasks defined with a collection_level value of 2).
  • The "collection_condition" is used to define the environmental condition under which this data collection task should run. As of the time of this writing, the possible values are "HADR", "PURESCALE", "UNIX", "WINDOWS", or nothing. For example, if a task is defined with a collection_condition value of "HADR", then the task is only executed if an db2 HADR environment is detected. If the collection_condition is nothing, then the task is always executed.
  • The "loader_join_columns", "loader_diff_exempt_columns", and "quickparse_summary_columns" will be the focus of a subsequent blog.

setup.py usage and arguments:

Use the setup.py --help option to display usage:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py --help

usage: setup.py [-h] [-un] [-pw] [-c] [-uc] [-ut] [-bp] [-cp] [-ap] [-ms]
                [-acmd] [-aext] [-lvl]
                database

Setup IBMHIST monitoring program. View README for more information.

positional arguments:
  database              name of database

optional arguments:
 -h, --help           show this help message and exit
 -un , --username     username used to connect to database, default is that
                      of current user
 -pw , --password     password used to connect to database, default is that
                      of current user
 -c, --cleanup        will drop the IBMHIST schema and all of its objects,
                      and unschedule its tasks
 -uc, --update_config_only
                      will update config values to passed in arguments or
                      reset to defaults for those not explicitly specified
 -ut, --update_tasks_only
                      will unschedule all old tasks and reschedule to match
                      with the task_details.json file
 -bp , --bldrtn_path  specify path for bldrtn script if it is not
                      automatically found, default:
                      /home/<current_user>/sqllib/samples/cpp/bldrtn
 -cp , --coll_path    directory path of data collection, default:
                      /home/<current user>/sqllib/db2dump
 -ap , --arch_path    directory path of data archival, default:
                      /home/<current user>/sqllib/db2dump
 -ms , --max_size     max size in bytes of collection and archival, default:
                      1073741824
 -acmd , --arch_cmd   command used to archive hourly folders with "_src_"
                      and "_dest_" placeholders, default: tar -caf _dest_
                      _src_
 -aext , --arch_ext   extension of archived hourly folders beginning with
                      ".", default: .tar.gz
 -lvl , --coll_lvl    scope of tasks to schedule (1 will only schedule key
                      tasks, 2 will schedule more), all tasks scheduled if
                      unspecified

  • The [database] name parameter is mandatory and specifies the local database against which the historical monitoring setup will be deployed.  If you have multiple databases in your database instance, and wish to deploy historical monitoring on other these databases, you must perform the setup.py script on each one separately.
  • The --username and --password arguments are used to establish the database connection. These are required if your database is configured to disallow local user IPC connections without a password. (If unspecified, a connection to the database using the current user is used).
  • The --cleanup option will perform a full removal of all previously created IBMHIST control tables, procedures, and external UDFs, and data collection tasks. Any previous data collection files or archived sub-folders will not be removed.
  • The --update_config_only option will perform only the creation (or re-creation) of the IBMHIST control tables, procedures, and external UDFs, and configure (or re-configure) any options/arguments that are specified. Note that default values are used for any options/arguments which are not specified, even if these arguments were specified during a previous invocation of the script.
  • The --update_tasks_only option will perform only the removal of any previously deployed data collection tasks, and will parse of the task_details.json file and (re)deploy all data collections tasks.
    (Note, when neither the --update_config_only or --update_tasks_only options are specified, then both operations are always performed).
  • The --bldrtn_path optional argument specifies the fully qualified path to the sqllib/samples/cpp/bldrtn executable. This is required to compile the external C++ UDFs which perform the collection of operating system 'SYS' data collection tasks. (If unspecified, the default path /home/<current_user>/sqllib/samples/cpp/bldrtn is used).
  • The --coll_path optional argument specifies the fully qualified path where a data collection sub-folder will be created (every hour) and the monitoring/diagnostic data will be placed into. (If unspecified, the default path /home/<current_user>/sqllib/db2dump/ is used).
  • The --arch_path optional argument specifies the fully qualified path to the folder where the data collection folder (every hour) will be moved after it is removed from the active coll_path. (If unspecified, the default path /home/<current_user>/sqllib/db2dump/ is used).
  • The --max_size optional argument specifies the maximum size (in bytes) of archival data to retain within the arch_path. When this limit is breached, hourly data collection folders (starting with the oldest) will be deleted. (If unspecified, the default value 1073741824 bytes (1GB) is used).
  • The --arch_cmd optional argument specifies the operating system command that is used (every hour) to tar/zip/compress the active data collection folder into the archive folder. A filename (in the format '<dbname>_<yyyy><mm><dd><hh>') is automatically generated and should not be specified. Tokens '_src_' and '_dest_' are used as placeholders in place of the coll_path and arch_path respectively. (If unspecified, the default command 'tar -caf _dest_ _src_' is assumed). For example, if one wanted to use the star unix command, instead of the default tar command, one might specify 'star -c -f=_dest_ _src_'.
  • The --arch_ext optional argument specifies the extension to tack onto the archived folder's file after the arch_cmd is completed. (If unspecified, the default extension ".tar.gz" is used).
  • The --coll_level optional argument specifies the level of data collections. If "1" is specified, only the data collection tasks in the task_details.json file with a "collection_level" value of 1 are deployed.  If "2" is specified, any data collection tasks with a "collection_level" value of 1 or 2 are deployed.


Example of running the setup.py script:

Here, we execute the setup.py script against database TESTDB1, using all default arguments:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1

First, we see the creation of the SYSTOOLSPACE table space (if not already existing), and the determination of environment conditions (UNIX/WINDOWS, HADR, and/or PURESCALE):

Connecting to database: TESTDB1
Creating tablespace SYSTOOLSPACE ...
Determining environment ...
    Determining if Windows or Unix ...
        Unix detected ...
    Determining if pureScale ...
        pureScale not detected ...
    Determining if HADR ...
        HADR not detected ...

Next we see the unscheduling of any previously scheduled tasks:
 
Unscheduling tasks ...

Next we see the creation (or recreation) of the IBMHIST schema, the creation of the IBMHIST control tables, procedures, and compilation of the external UDFs.

Setting up IBMHIST schema and its objects ...
    Creating IBMHIST schema ...
    Setting schema to IBMHIST ...
    Registering external functions ...
        Copying bldrtn script from '/home/db2inst1/sqllib/samples/cpp/bldrtn' to sql/ ...
        Building external functions ...
        Executing external.sql ...
    Registering tables ...
        Executing tab_config.sql ...
        Executing tab_errs.sql ...
        Executing tab_tasks.sql ...
        Executing tab_dirs.sql ...
    Registering procedures ...
        Executing proc_collect.sql ...
        Executing proc_archive.sql ...
    Removing additional files created during setup ...
 
Next we see the configuration of settings based on setup.py input arguments (or default values as in this example):
 
Configuring IBMHIST settings ...
    Deleting configurations from IBMHIST.TAB_CONFIG ...
    Setting COLL_PATH to '/home/db2inst1/sqllib/db2dump' ...
    Setting ARCH_PATH to '/home/db2inst1/sqllib/db2dump' ...
    Setting MAX_SIZE to '1073741824' ...
    Setting ARCH_CMD to 'tar -caf _dest_ _src_' and ARCH_EXT to '.tar.gz' ...
    Setting TASK_DETAILS_PATH to '/home/hotellnx112/db2inst1/db2histmon-master/1_setup/task_details.json' ...

 Next we see some validation tests and sanity checks:
 
Testing IBMHIST functionality ...
    Testing IBMHIST.PROC_COLLECT with a dummy SQL task ...
    Testing IBMHIST.PROC_COLLECT with a dummy SYS task ...
    Testing IBMHIST.PROC_ARCHIVE to archive collection directories ...
    Testing IBMHIST.PROC_ARCHIVE to delete archival directories once max size is reached ...

Next we see the parsing of the task_details.json file, and the registration of data collection tasks to the Db2 Admin Task Scheduler:
 
Scheduling tasks ...
    Setting Admin Task Scheduler registry variable ...
    Reading from task_details.json file ...
    Scheduling task: ENV_GET_SYSTEM_RESOURCES
    Scheduling task: MON_GET_AGENT
    Scheduling task: MON_GET_CONNECTION
    Scheduling task: MON_GET_ACTIVITY
    Scheduling task: MON_GET_UNIT_OF_WORK
    Scheduling task: MON_CURRENT_SQL
    Scheduling task: MON_GET_APPL_LOCKWAIT
    Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
    Scheduling task: MON_GET_MEMORY_POOL
    Scheduling task: MON_GET_MEMORY_SET
    Scheduling task: MON_GET_TRANSACTION_LOG
    Scheduling task: DB_GET_CFG
    Scheduling task: DBMCFG
    Scheduling task: ENV_INST_INFO
    Scheduling task: ENV_GET_REG_VARIABLES
    Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
    Scheduling task: MON_GET_GROUP_BUFFERPOOL
    Scheduling task: MON_GET_INDEX
    Scheduling task: MON_GET_LATCH
    Scheduling task: MON_GET_BUFFERPOOL
    Scheduling task: MON_GET_PAGE_ACCESS_INFO
    Scheduling task: MON_GET_PKG_CACHE_STMT
    Scheduling task: MON_GET_REBALANCE_STATUS
    Scheduling task: MON_GET_SERVERLIST
    Scheduling task: MON_GET_TABLE
    Scheduling task: MON_GET_TABLESPACE
    Scheduling task: MON_GET_UTILITY
    Scheduling task: MON_GET_WORKLOAD
    Scheduling task: VMSTAT
    Scheduling task: IOSTAT
    Scheduling task: NETSTAT
    Scheduling task: ARCHIVE
    Collection will begin in approximately 5 minutes ... 
 
Finally, the setup script is complete. Within about 5mins the Db2 Admin Task Scheduler should begin executing the data collection tasks, at their defined frequency.
 
Closing connection ...
Done


The Data Collection and Data Archive folders:

Approximately 5mins after executing the setup script, the Db2 Admin Task Scheduler should begin executing the data collection tasks, at their defined collection frequency.

Because I did not explicitly specify a --coll_path for the data collection, the default path within the database's DIAGPATH (usually ~/sqllib/db2dump/) is used, and a folder "IBMHIST_<dbname>" is created. Likewise for the --arch_path, a folder named "IBMHIST_<dbname>_archive" is created:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/
$ ls
 
db2ats_sps_bnd.log  
db2diag.log   
db2inst1.nfy  
IBMHIST_TESTDB1/       
IBMHIST_TESTDB1_archive/
stmmlog

Within the data collection folder, we can see that an hourly data collection sub-folder has been created:
 
<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/
$ ls
 
TESTDB1_2020062714/

And within this hourly data collection sub-folder, we see the output of each of the data collection tasks, at their defined frequencies.  In this example, I listed the directory contents approximately 9mins after the Db2 Admin Task Scheduler began executing the tasks:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062714/
$ ls
 
ENV_GET_SYSTEM_RESOURCES_202006271421.del        MON_GET_MEMORY_SET_202006271421.del
ENV_GET_SYSTEM_RESOURCES_202006271422.del        MON_GET_MEMORY_SET_202006271422.del
ENV_GET_SYSTEM_RESOURCES_202006271423.del        MON_GET_MEMORY_SET_202006271423.del
ENV_GET_SYSTEM_RESOURCES_202006271424.del        MON_GET_MEMORY_SET_202006271424.del
ENV_GET_SYSTEM_RESOURCES_202006271425.del        MON_GET_MEMORY_SET_202006271425.del
ENV_GET_SYSTEM_RESOURCES_202006271426.del        MON_GET_MEMORY_SET_202006271426.del
ENV_GET_SYSTEM_RESOURCES_202006271427.del        MON_GET_MEMORY_SET_202006271427.del
ENV_GET_SYSTEM_RESOURCES_202006271428.del        MON_GET_MEMORY_SET_202006271428.del
ENV_GET_SYSTEM_RESOURCES_202006271429.del        MON_GET_MEMORY_SET_202006271429.del
IOSTAT_202006271421.del                          MON_GET_PAGE_ACCESS_INFO_202006271421.del
IOSTAT_202006271422.del                          MON_GET_PAGE_ACCESS_INFO_202006271422.del
IOSTAT_202006271423.del                          MON_GET_PAGE_ACCESS_INFO_202006271423.del
IOSTAT_202006271424.del                          MON_GET_PAGE_ACCESS_INFO_202006271424.del
IOSTAT_202006271425.del                          MON_GET_PAGE_ACCESS_INFO_202006271425.del
IOSTAT_202006271426.del                          MON_GET_PAGE_ACCESS_INFO_202006271426.del
IOSTAT_202006271427.del                          MON_GET_PAGE_ACCESS_INFO_202006271427.del
IOSTAT_202006271428.del                          MON_GET_PAGE_ACCESS_INFO_202006271428.del
IOSTAT_202006271429.del                          MON_GET_PAGE_ACCESS_INFO_202006271429.del
MON_CURRENT_SQL_202006271421.del                 MON_GET_REBALANCE_STATUS_202006271421.del
MON_CURRENT_SQL_202006271424.del                 MON_GET_REBALANCE_STATUS_202006271422.del
MON_CURRENT_SQL_202006271427.del                 MON_GET_REBALANCE_STATUS_202006271423.del
MON_GET_ACTIVITY_202006271422.del                MON_GET_REBALANCE_STATUS_202006271424.del
MON_GET_ACTIVITY_202006271424.del                MON_GET_REBALANCE_STATUS_202006271425.del
MON_GET_ACTIVITY_202006271426.del                MON_GET_REBALANCE_STATUS_202006271426.del
MON_GET_ACTIVITY_202006271428.del                MON_GET_REBALANCE_STATUS_202006271427.del
MON_GET_AGENT_202006271422.del                   MON_GET_REBALANCE_STATUS_202006271428.del
MON_GET_AGENT_202006271424.del                   MON_GET_REBALANCE_STATUS_202006271429.del
MON_GET_AGENT_202006271426.del                   MON_GET_SERVERLIST_202006271421.del
MON_GET_AGENT_202006271428.del                   MON_GET_SERVERLIST_202006271422.del
MON_GET_APPL_LOCKWAIT_202006271421.del           MON_GET_SERVERLIST_202006271423.del
MON_GET_APPL_LOCKWAIT_202006271422.del           MON_GET_SERVERLIST_202006271424.del
MON_GET_APPL_LOCKWAIT_202006271423.del           MON_GET_SERVERLIST_202006271425.del
MON_GET_APPL_LOCKWAIT_202006271424.del           MON_GET_SERVERLIST_202006271426.del
MON_GET_APPL_LOCKWAIT_202006271425.del           MON_GET_SERVERLIST_202006271427.del
MON_GET_APPL_LOCKWAIT_202006271426.del           MON_GET_SERVERLIST_202006271428.del
MON_GET_APPL_LOCKWAIT_202006271427.del           MON_GET_SERVERLIST_202006271429.del
MON_GET_APPL_LOCKWAIT_202006271428.del           MON_GET_TRANSACTION_LOG_202006271421.del
MON_GET_APPL_LOCKWAIT_202006271429.del           MON_GET_TRANSACTION_LOG_202006271422.del
MON_GET_CONNECTION_202006271421.del              MON_GET_TRANSACTION_LOG_202006271423.del
MON_GET_CONNECTION_202006271424.del              MON_GET_TRANSACTION_LOG_202006271424.del
MON_GET_CONNECTION_202006271427.del              MON_GET_TRANSACTION_LOG_202006271425.del
MON_GET_EXTENDED_LATCH_WAIT_202006271421.del     MON_GET_TRANSACTION_LOG_202006271426.del
MON_GET_EXTENDED_LATCH_WAIT_202006271422.del     MON_GET_TRANSACTION_LOG_202006271427.del
MON_GET_EXTENDED_LATCH_WAIT_202006271423.del     MON_GET_TRANSACTION_LOG_202006271428.del
MON_GET_EXTENDED_LATCH_WAIT_202006271424.del     MON_GET_TRANSACTION_LOG_202006271429.del
MON_GET_EXTENDED_LATCH_WAIT_202006271425.del     MON_GET_UNIT_OF_WORK_202006271422.del
MON_GET_EXTENDED_LATCH_WAIT_202006271426.del     MON_GET_UNIT_OF_WORK_202006271424.del
MON_GET_EXTENDED_LATCH_WAIT_202006271427.del     MON_GET_UNIT_OF_WORK_202006271426.del
MON_GET_EXTENDED_LATCH_WAIT_202006271428.del     MON_GET_UNIT_OF_WORK_202006271428.del
MON_GET_EXTENDED_LATCH_WAIT_202006271429.del     MON_GET_UTILITY_202006271421.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271421.del  MON_GET_UTILITY_202006271422.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271422.del  MON_GET_UTILITY_202006271423.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271423.del  MON_GET_UTILITY_202006271424.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271424.del  MON_GET_UTILITY_202006271425.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271425.del  MON_GET_UTILITY_202006271426.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271426.del  MON_GET_UTILITY_202006271427.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271427.del  MON_GET_UTILITY_202006271428.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271428.del  MON_GET_UTILITY_202006271429.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006271429.del  MON_GET_WORKLOAD_202006271425.del
MON_GET_INDEX_202006271425.del                   NETSTAT_202006271421.del
MON_GET_LATCH_202006271421.del                   NETSTAT_202006271422.del
MON_GET_LATCH_202006271422.del                   NETSTAT_202006271423.del
MON_GET_LATCH_202006271423.del                   NETSTAT_202006271424.del
MON_GET_LATCH_202006271424.del                   NETSTAT_202006271425.del
MON_GET_LATCH_202006271425.del                   NETSTAT_202006271426.del
MON_GET_LATCH_202006271426.del                   NETSTAT_202006271427.del
MON_GET_LATCH_202006271427.del                   NETSTAT_202006271428.del
MON_GET_LATCH_202006271428.del                   NETSTAT_202006271429.del
MON_GET_LATCH_202006271429.del                   task_details_copy.json
MON_GET_MEMORY_POOL_202006271421.del             VMSTAT_202006271421.del
MON_GET_MEMORY_POOL_202006271422.del             VMSTAT_202006271422.del
MON_GET_MEMORY_POOL_202006271423.del             VMSTAT_202006271423.del
MON_GET_MEMORY_POOL_202006271424.del             VMSTAT_202006271424.del
MON_GET_MEMORY_POOL_202006271425.del             VMSTAT_202006271425.del
MON_GET_MEMORY_POOL_202006271426.del             VMSTAT_202006271426.del
MON_GET_MEMORY_POOL_202006271427.del             VMSTAT_202006271427.del
MON_GET_MEMORY_POOL_202006271428.del             VMSTAT_202006271428.del
MON_GET_MEMORY_POOL_202006271429.del             VMSTAT_202006271429.del


After a few hours, I can see that each hours data collection folders has been tar+gzipped and moved to the archive folder:

(db2inst1@hotellnx112) /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive
$ ls
 
TESTDB1_2020062716.tar.gz
TESTDB1_2020062717.tar.gz
TESTDB1_2020062718.tar.gz
TESTDB1_2020062719.tar.gz



The IBMHIST Control Tables

As mentioned earlier, the historical monitoring framework uses a set of control tables in order to operate. We can see the tables here:

(db2inst1@hotellnx112) /home/db2inst1/db2histmon-master/1_setup/
$ db2 connect to testdb1
$ db2 list tables for schema IBMHIST

Table/View                      Schema          Type  Creation time
------------------------------- --------------- ----- --------------------------
TAB_CONFIG                      IBMHIST         T     2020-06-27-14.12.07.943550
TAB_DIRS                        IBMHIST         T     2020-06-27-14.12.08.907709
TAB_ERRS                        IBMHIST         T     2020-06-27-14.12.08.267541
TAB_TASKS                       IBMHIST         T     2020-06-27-14.12.08.521290


  • The IBMHIST.TAB_CONFIG table contains information about the configuration parameters/arguments that were passed into the setup.py script (or the default value, if they were not specified).

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select CONFIG_NAME, VALUE, DESCRIPTION from IBMHIST.TAB_CONFIG"

CONFIG_NAME          VALUE                                    DESCRIPTION
-------------------- ---------------------------------------- ----------------------------------------
COLL_PATH            /home/db2inst1/sqllib/db2dump            DIRECTORY PATH OF DATA COLLECTION
ARCH_PATH            /home/db2inst1/sqllib/db2dump            DIRECTORY PATH OF DATA ARCHIVAL
MAX_SIZE             1073741824                               MAX SIZE OF COLLECTION AND ARCHIVAL
ARCH_CMD             tar -caf _dest_ _src_                    COMMAND USED TO ARCHIVE HOURLY DIRECTORI
ARCH_EXT             .tar.gz                                  EXTENSION OF ARCHIVE HOURLY DIRECTORIES
TASK_DETAILS_PATH    /home/hotellnx112/db2inst1/db2histmon-ma LOCATION OF task_details.json FILE

  6 record(s) selected.


  • The IBMHIST.TAB_DIRS table contains information about the current (and any previously specified) data collection paths (coll_path), and a history of each data archive path (arch_path), including the size of the archived file.

For example, we can see the current hourly data collection path as so:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select PATH, SIZE, STATUS, TIME from IBMHIST.TAB_DIRS where STATUS='COLL'"

PATH                                                                   SIZE                 STATUS TIME      
---------------------------------------------------------------------- -------------------- ------ --------------------------
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062910                    5133004 COLL   2020-06-29-10.43.00.000000

  1 record(s) selected.


 Or we can see a list of all the previously archived hourly archived folders as so:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select PATH, SIZE, STATUS, TIME from IBMHIST.TAB_DIRS where STATUS='ARCH' order by TIME asc"

PATH                                                                   SIZE                 STATUS TIME      
---------------------------------------------------------------------- -------------------- ------ --------------------------
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627               328432 ARCH   2020-06-27-16.59.00.000000
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627              1901572 ARCH   2020-06-27-17.59.00.000000
/home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1_archive/TESTDB1_20200627              1923296 ARCH   2020-06-27-18.59.00.000000
...etc...

  15 record(s) selected.

  • The IBMHIST.TAB_TASKS table contains information about each data collection task, as was defined in the task_detail.json file when the full setup.py was last performed (or refreshed).
For example, we can list each active data collection task including it's operating system command or sql query:

(db2inst1@hotellnx112) /home/db2inst1/sqllib/
$ db2 "select COLL_NAME, CLASS, COMMAND from IBMHIST.TAB_TASKS"

COLL_NAME            CLASS COMMAND
-------------------- ----- ----------------------------------------------------------------------
ENV_GET_SYSTEM_RESOU SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( SYSPROC.
MON_GET_AGENT        SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_CONNECTION   SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_ACTIVITY     SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_UNIT_OF_WORK SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_CURRENT_SQL      SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.MON_CU
MON_GET_APPL_LOCKWAI SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE( MON_GET_A
MON_GET_EXTENDED_LAT SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_MEMORY_POOL  SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE (MON_GET_M
MON_GET_MEMORY_SET   SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TRANSACTION_ SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
DB_GET_CFG           SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( DB_GET_C
DBMCFG               SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.DBMCFG
ENV_INST_INFO        SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM SYSIBMADM.ENV_IN
ENV_GET_REG_VARIABLE SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( ENV_GET_
MON_GET_EXTENT_MOVEM SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_GROUP_BUFFER SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_INDEX        SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_LATCH        SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_BUFFERPOOL   SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_PAGE_ACCESS_ SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_PKG_CACHE_ST SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_REBALANCE_ST SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_SERVERLIST   SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TABLE        SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_TABLESPACE   SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_UTILITY      SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
MON_GET_WORKLOAD     SQL   SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE ( MON_GET_
VMSTAT               SYS   vmstat
IOSTAT               SYS   iostat
NETSTAT              SYS   netstat

For SQL class tasks, we can also list all the column names of the generated data by examining the HEADER column:

(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select COLL_NAME,  HEADER as COLUMN_NAMES from IBMHIST.TAB_TASKS where CLASS='SQL'"

COLL_NAME            COLUMN_NAMES
-------------------- ----------------------------------------------------------------------
ENV_GET_SYSTEM_RESOU COLLECTION_TIME,MEMBER,OS_NAME,HOST_NAME,OS_VERSION,OS_RELEASE,MACHINE
MON_GET_AGENT        COLLECTION_TIME,SERVICE_SUPERCLASS_NAME,SERVICE_SUBCLASS_NAME,APPLICAT
MON_GET_CONNECTION   COLLECTION_TIME,APPLICATION_HANDLE,APPLICATION_NAME,APPLICATION_ID,MEM
MON_GET_ACTIVITY     COLLECTION_TIME,APPLICATION_HANDLE,DBPARTITIONNUM,COORD_PARTITION_NUM,
MON_GET_UNIT_OF_WORK COLLECTION_TIME,SERVICE_SUPERCLASS_NAME,SERVICE_SUBCLASS_NAME,SERVICE_
MON_CURRENT_SQL      COLLECTION_TIME,COORD_MEMBER,APPLICATION_HANDLE,APPLICATION_NAME,SESSI
MON_GET_APPL_LOCKWAI COLLECTION_TIME,LOCK_WAIT_START_TIME,LOCK_NAME,LOCK_OBJECT_TYPE_ID,LOC
MON_GET_EXTENDED_LAT COLLECTION_TIME,MEMBER,LATCH_NAME,TOTAL_EXTENDED_LATCH_WAITS,TOTAL_EXT
MON_GET_MEMORY_POOL  COLLECTION_TIME,MEMBER,HOST_NAME,DB_NAME,MEMORY_SET_TYPE,MEMORY_POOL_T
MON_GET_MEMORY_SET   COLLECTION_TIME,MEMBER,HOST_NAME,DB_NAME,MEMORY_SET_TYPE,MEMORY_SET_ID
MON_GET_TRANSACTION_ COLLECTION_TIME,MEMBER,TOTAL_LOG_AVAILABLE,TOTAL_LOG_USED,SEC_LOG_USED
DB_GET_CFG           COLLECTION_TIME,NAME,VALUE,VALUE_FLAGS,DEFERRED_VALUE,DEFERRED_VALUE_F
DBMCFG               COLLECTION_TIME,NAME,VALUE,VALUE_FLAGS,DEFERRED_VALUE,DEFERRED_VALUE_F
ENV_INST_INFO        COLLECTION_TIME,INST_NAME,IS_INST_PARTITIONABLE,NUM_DBPARTITIONS,INST_
ENV_GET_REG_VARIABLE COLLECTION_TIME,MEMBER,REG_VAR_NAME,REG_VAR_VALUE,REG_VAR_ON_DISK_VALU
MON_GET_EXTENT_MOVEM COLLECTION_TIME,TBSP_NAME,TBSP_ID,MEMBER,CURRENT_EXTENT,LAST_EXTENT,NU
MON_GET_GROUP_BUFFER COLLECTION_TIME,MEMBER,NUM_GBP_FULL
MON_GET_INDEX        COLLECTION_TIME,TABSCHEMA,TABNAME,IID,MEMBER,DATA_PARTITION_ID,NLEAF,N
MON_GET_LATCH        COLLECTION_TIME,LATCH_NAME,MEMORY_ADDRESS,EDU_ID,EDU_NAME,APPLICATION_
MON_GET_BUFFERPOOL   COLLECTION_TIME,BP_NAME,MEMBER,AUTOMATIC,DIRECT_READS,DIRECT_READ_REQS
MON_GET_PAGE_ACCESS_ COLLECTION_TIME,TABSCHEMA,TABNAME,OBJTYPE,MEMBER,PAGE_RECLAIMS_X,PAGE_
MON_GET_PKG_CACHE_ST COLLECTION_TIME,MEMBER,SECTION_TYPE,INSERT_TIMESTAMP,EXECUTABLE_ID,PAC
MON_GET_REBALANCE_ST COLLECTION_TIME,TBSP_NAME,TBSP_ID,DBPARTITIONNUM,MEMBER,REBALANCER_MOD
MON_GET_SERVERLIST   COLLECTION_TIME,MEMBER,CACHED_TIMESTAMP,HOSTNAME,PORT_NUMBER,SSL_PORT_
MON_GET_TABLE        COLLECTION_TIME,TABSCHEMA,TABNAME,MEMBER,TAB_TYPE,TAB_FILE_ID,DATA_PAR
MON_GET_TABLESPACE   COLLECTION_TIME,TBSP_NAME,TBSP_ID,MEMBER,TBSP_TYPE,TBSP_CONTENT_TYPE,T
MON_GET_UTILITY      COLLECTION_TIME,MEMBER,COORD_MEMBER,APPLICATION_HANDLE,APPLICATION_ID,
MON_GET_WORKLOAD     COLLECTION_TIME,WORKLOAD_NAME,WORKLOAD_ID,MEMBER,ACT_ABORTED_TOTAL,ACT

  28 record(s) selected.

  • The IBMHIST.TAB_ERRS will contain a history of any execution errors that occurred by the historical monitoring framework:
(db2inst1@hotellnx112) /home/db2inst1/
$ db2 "select TIME, CALLER, COLL_NAME, ERRMSG from IBMHIST.TAB_ERRS"

TIME                       CALLER               COLL_NAME            4                                       
-------------------------- -------------------- -------------------- ------------------------------------------------------------
2020-06-27-14.51.29.200058 PROC_COLLECT         DUMMY_SQL_TASK       Failed to write to data collection path: /home/db2inst1/MYHI...etc...

   1 record(s) selected.
 
 

Example of modifying a setup parameter (changing the data collection path):


First we create a new folder to host the active data collection:
 
(db2inst1@hotellnx112) /home/db2inst1/
$ mkdir /home/db2inst1/MYHISTMON_COLLECTION_TESTDB1

Since the data collections are performed by external UDFs under a fenced userid (not the db2-instance-owner userid), our new folder must be accessible to the fenced userid, for simplicity I'm giving the folder permission to everyone:
 
(db2inst1@hotellnx112) /home/db2inst1/
$ chmod 777 MYHISTMON_COLLECTION_TESTDB1

Next I execute the setup script, specifying the --coll_path option with our new path. I also specify the --update_config_only option, since I did not modify the task_details.json file and  do not need to update all the data collection tasks.

(db2inst1@hotellnx112) /home/dbb2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --coll_path /home/db2inst1MYHISTMON_COLLECTION_TESTDB1/ --update_config_only
 
Connecting to database: TESTDB1
Configuring IBMHIST settings ...
    Deleting configurations from IBMHIST.TAB_CONFIG ...
    Setting COLL_PATH to '/home/db2inst1/MYHISTMON_COLLECTION_TESTDB1' ...
    Setting ARCH_PATH to '/home/db2inst1/sqllib/db2dump' ...
    Setting MAX_SIZE to '1073741824' ...
    Setting ARCH_CMD to 'tar -caf _dest_ _src_' and ARCH_EXT to '.tar.gz' ...
    Setting TASK_DETAILS_PATH to '/home/hotellnx112/db2inst1/db2histmon-master/1_setup/task_details.json' ...
Closing connection ...
Done

All now all subsequent data collection tasks will be put into the new data collection path.
(Note that existing data in the old collection path will be automatically moved into the new path).

(db2inst1@hotellnx112) /home/db2inst1/MYHISTMON_COLLECTION_TESTDB1/IBMHIST_TESTDB1/TESTDB1_2020062913
$ ls

IOSTAT_202006291304.del
IOSTAT_202006291305.del
IOSTAT_202006291306.del
IOSTAT_202006291307.del
IOSTAT_202006291308.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291304.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291305.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291306.del
MON_GET_EXTENT_MOVEMENT_STATUS_202006291307.del
...etc...
...etc...


Example of Adding a new data collection task (a new operating system metric):

In this example, I want to add a new task to collect "ps -elf" output from the operating system, every minute.

First, we edit our task_details.json file to add a new entry to the bottom for our new task called 'PS_ELF' as so:
(note, I can probably just leave the collection_condition field blank, since I do not intend to run this task_details.json file in a Windows environment, and thus always executing this data collectiont ask would be fine, but for completeness I'll add UNIX as a collection_condition, since 'ps -elf' makes no sense in Windows).

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

    ...etc...
    ...etc...
    ,
    {
        "collection_name": "PS_ELF",
        "collection_class": "SYS",
        "collection_command": "ps -elf",
        "collection_freq": "* * * * *",
        "collection_level": "1",
        "collection_condition": "UNIX",
        "loader_join_columns": "",
        "loader_diff_exempt_columns": "",
        "quickparse_summary_columns": ""
    }

Next, I executed the setup.py script.  Since I'm only adding a new task, I'll just use the --update_tasks_only option.  We can see our new PS_ELF task included:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --update_tasks_only

Connecting to database: TESTDB1
Determining environment ...
    Determining if Windows or Unix ...
        Unix detected ...
    Determining if pureScale ...
        pureScale not detected ...
    Determining if HADR ...
        HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
    Setting Admin Task Scheduler registry variable ...
    Reading from task_details.json file ...
    Scheduling task: ENV_GET_SYSTEM_RESOURCES
    Scheduling task: MON_GET_AGENT
    Scheduling task: MON_GET_CONNECTION
    Scheduling task: MON_GET_ACTIVITY
    Scheduling task: MON_GET_UNIT_OF_WORK
    Scheduling task: MON_CURRENT_SQL
    Scheduling task: MON_GET_APPL_LOCKWAIT
    Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
    Scheduling task: MON_GET_MEMORY_POOL
    Scheduling task: MON_GET_MEMORY_SET
    Scheduling task: MON_GET_TRANSACTION_LOG
    Scheduling task: DB_GET_CFG
    Scheduling task: DBMCFG
    Scheduling task: ENV_INST_INFO
    Scheduling task: ENV_GET_REG_VARIABLES
    Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
    Scheduling task: MON_GET_GROUP_BUFFERPOOL
    Scheduling task: MON_GET_INDEX
    Scheduling task: MON_GET_LATCH
    Scheduling task: MON_GET_BUFFERPOOL
    Scheduling task: MON_GET_PAGE_ACCESS_INFO
    Scheduling task: MON_GET_PKG_CACHE_STMT
    Scheduling task: MON_GET_REBALANCE_STATUS
    Scheduling task: MON_GET_SERVERLIST
    Scheduling task: MON_GET_TABLE
    Scheduling task: MON_GET_TABLESPACE
    Scheduling task: MON_GET_UTILITY
    Scheduling task: MON_GET_WORKLOAD
    Scheduling task: VMSTAT
    Scheduling task: IOSTAT
    Scheduling task: NETSTAT
    Scheduling task: PS_ELF
    Scheduling task: ARCHIVE
    Collection will begin in approximately 5 minutes ... 
Closing connection ...
Done

 

If we examine the IBMHIST.TAB_TASKS table, we can also see the new PS_ELF data collection task:

db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ db2 "select COLL_NAME from IBMHIST.TAB_TASKS where COLL_NAME='PS_ELF'"

COLL_NAME                                                            
----------------------------------------------------------------
PS_ELF                                                          

  1 record(s) selected.

If we examine the  SYSTOOLS.ADMIN_TASK_LIST table, we can see that our PS_ELF task has been added to the admin task scheduler:

db2inst1@hotellnx112> /home/db2inst1/ 
$ db2 "SELECT NAME, PROCEDURE_NAME, REMARKS from SYSTOOLS.ADMIN_TASK_LIST"
 
NAME                           PROCEDURE_NAME       REMARKS
------------------------------ -------------------- ------------------------------
MON_GET_LATCH                  PROC_COLLECT         IBMHIST Collection
MON_GET_BUFFERPOOL             PROC_COLLECT         IBMHIST Collection
MON_GET_PAGE_ACCESS_INFO       PROC_COLLECT         IBMHIST Collection
MON_GET_PKG_CACHE_STMT         PROC_COLLECT         IBMHIST Collection
MON_GET_REBALANCE_STATUS       PROC_COLLECT         IBMHIST Collection
MON_GET_SERVERLIST             PROC_COLLECT         IBMHIST Collection
MON_GET_TABLE                  PROC_COLLECT         IBMHIST Collection
MON_GET_TABLESPACE             PROC_COLLECT         IBMHIST Collection
MON_GET_UTILITY                PROC_COLLECT         IBMHIST Collection
MON_GET_WORKLOAD               PROC_COLLECT         IBMHIST Collection
VMSTAT                         PROC_COLLECT         IBMHIST Collection
IOSTAT                         PROC_COLLECT         IBMHIST Collection
NETSTAT                        PROC_COLLECT         IBMHIST Collection
ARCHIVE                        PROC_ARCHIVE         IBMHIST Archival
PS_ELF                         PROC_COLLECT         IBMHIST Collection
MON_GET_FCM                    PROC_COLLECT         IBMHIST Collection
MON_GET_AGENT                  PROC_COLLECT         IBMHIST Collection
MON_GET_CONNECTION             PROC_COLLECT         IBMHIST Collection
MON_GET_ACTIVITY               PROC_COLLECT         IBMHIST Collection
MON_GET_UNIT_OF_WORK           PROC_COLLECT         IBMHIST Collection
MON_CURRENT_SQL                PROC_COLLECT         IBMHIST Collection
MON_GET_APPL_LOCKWAIT          PROC_COLLECT         IBMHIST Collection
MON_GET_EXTENDED_LATCH_WAIT    PROC_COLLECT         IBMHIST Collection
MON_GET_MEMORY_POOL            PROC_COLLECT         IBMHIST Collection
MON_GET_MEMORY_SET             PROC_COLLECT         IBMHIST Collection
MON_GET_TRANSACTION_LOG        PROC_COLLECT         IBMHIST Collection
DB_GET_CFG                     PROC_COLLECT         IBMHIST Collection
DBMCFG                         PROC_COLLECT         IBMHIST Collection
ENV_INST_INFO                  PROC_COLLECT         IBMHIST Collection
ENV_GET_REG_VARIABLES          PROC_COLLECT         IBMHIST Collection
MON_GET_EXTENT_MOVEMENT_STATUS PROC_COLLECT         IBMHIST Collection
MON_GET_GROUP_BUFFERPOOL       PROC_COLLECT         IBMHIST Collection
MON_GET_INDEX                  PROC_COLLECT         IBMHIST Collection
ENV_GET_SYSTEM_RESOURCES       PROC_COLLECT         IBMHIST Collection

  34 record(s) selected.



Lastly, if we wait approximately 5mins for the Db2 Admin Task Scheduler to process the new task, we can then begin seeing the output of our new PS_ELF task within the hourly data collection folder:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062717/
$ ls -l

DB_GET_CFG_202006271700.del
DBMCFG_202006271700.del
...etc... 
...etc... 
PS_ELF_202006271700.del
...etc...
...etc... 
 

Example of Adding a new data collection task  (a new sql query):

In this example, I want to add a new task to collect MON_GET_FCM() output every 10mins.

First, I'll edit the task_details.json file to add a new entry to the bottom for our new task called 'MON_GET_FCM', as so:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

    ...etc...
    ...etc...
    ,
    {
        "collection_name": "MON_GET_FCM",
        "collection_class": "SQL",
        "collection_command": "SELECT CURRENT TIMESTAMP AS COLLECTION_TIME, T.* FROM TABLE (MON_GET_FCM(-2)) T",
        "collection_freq": "10 * * * *",
        "collection_level": "1",
        "collection_condition": "",
        "loader_join_columns": "",
        "loader_diff_exempt_columns": "",
        "quickparse_summary_columns": ""
    }

Next, I'll invoke the setup.py script. Here I use the --update_tasks_only option.  We can see the new MON_GET_FCM task added:
 
<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup.
$ python3 setup.py testdb1 --update_tasks_only
 
Connecting to database: TESTDB1
Determining environment ...
    Determining if Windows or Unix ...
        Unix detected ...
    Determining if pureScale ...
        pureScale not detected ...
    Determining if HADR ...
        HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
    Setting Admin Task Scheduler registry variable ...
    Reading from task_details.json file ...
    Scheduling task: ENV_GET_SYSTEM_RESOURCES
    Scheduling task: MON_GET_AGENT
    Scheduling task: MON_GET_CONNECTION
    Scheduling task: MON_GET_ACTIVITY
    Scheduling task: MON_GET_UNIT_OF_WORK
    Scheduling task: MON_CURRENT_SQL
    Scheduling task: MON_GET_APPL_LOCKWAIT
    Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
    Scheduling task: MON_GET_MEMORY_POOL
    Scheduling task: MON_GET_MEMORY_SET
    Scheduling task: MON_GET_TRANSACTION_LOG
    Scheduling task: DB_GET_CFG
    Scheduling task: DBMCFG
    Scheduling task: ENV_INST_INFO
    Scheduling task: ENV_GET_REG_VARIABLES
    Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
    Scheduling task: MON_GET_GROUP_BUFFERPOOL
    Scheduling task: MON_GET_INDEX
    Scheduling task: MON_GET_LATCH
    Scheduling task: MON_GET_BUFFERPOOL
    Scheduling task: MON_GET_PAGE_ACCESS_INFO
    Scheduling task: MON_GET_PKG_CACHE_STMT
    Scheduling task: MON_GET_REBALANCE_STATUS
    Scheduling task: MON_GET_SERVERLIST
    Scheduling task: MON_GET_TABLE
    Scheduling task: MON_GET_TABLESPACE
    Scheduling task: MON_GET_UTILITY
    Scheduling task: MON_GET_WORKLOAD
    Scheduling task: VMSTAT
    Scheduling task: IOSTAT
    Scheduling task: NETSTAT
    Scheduling task: PS_ELF
    Scheduling task: MON_GET_FCM
    Scheduling task: ARCHIVE
    Collection will begin in approximately 5 minutes ...
Closing connection ...
Done


We can see our new task ash been added to the IBMHIST.TAB_TASKS table:
 
<db2inst1@hotellnx112> /home/db2inst1/
$ db2 "select COLL_NAME from IBMHIST.TAB_TASKS where COLL_NAME='MON_GET_FCM'"

COLL_NAME
----------------------------------------------------------------
MON_GET_FCM

  1 record(s) selected.

Lastly, if we wait approximately 5mins for the Db2 Admin Task Scheduler to process the new task, we can then begin seeing the output of our new MON_GET_FCM task within the hourly data collection folder:

<db2inst1@hotellnx112> /home/db2inst1/sqllib/db2dump/IBMHIST_TESTDB1/TESTDB1_2020062914/
$ ls -l

DB_GET_CFG_202006291400.del 
DBMCFG_202006291400.del
...etc...  
...etc...  
MON_GET_FCM_202006291439.del
...etc...  
...etc...  

Example of Removing an data collection task:

The cleanest way to remove a data collection task is to edit the task_details.json file, remove the desired task entry, and then run the setup.py script again, using the --update_tasks_only option.

For example, if I wanted to remove the iostat data collection task, I would remove this entry from the task_details.json file:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ vim task_details.json

    ...etc...
    ...etc...
    {
        "collection_name": "IOSTAT",
        "collection_class": "SYS",
        "collection_command": "iostat",
        "collection_freq": "* * * * *",
        "collection_level": "1",
        "collection_condition": "UNIX",
        "loader_join_columns": "",
        "loader_diff_exempt_columns": "",
        "quickparse_summary_columns": ""
    },
    ...etc...
    ...etc...

And the re-run the setup.py script using the --update_tasks_only option: 

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --update_tasks_only
 
Connecting to database: TESTDB1
Determining environment ...
    Determining if Windows or Unix ...
        Unix detected ...
    Determining if pureScale ...
        pureScale not detected ...
    Determining if HADR ...
        HADR not detected ...
Unscheduling tasks ...
Scheduling tasks ...
    Setting Admin Task Scheduler registry variable ...
    Reading from task_details.json file ...
    Scheduling task: ENV_GET_SYSTEM_RESOURCES
    Scheduling task: MON_GET_AGENT
    Scheduling task: MON_GET_CONNECTION
    Scheduling task: MON_GET_ACTIVITY
    Scheduling task: MON_GET_UNIT_OF_WORK
    Scheduling task: MON_CURRENT_SQL
    Scheduling task: MON_GET_APPL_LOCKWAIT
    Scheduling task: MON_GET_EXTENDED_LATCH_WAIT
    Scheduling task: MON_GET_MEMORY_POOL
    Scheduling task: MON_GET_MEMORY_SET
    Scheduling task: MON_GET_TRANSACTION_LOG
    Scheduling task: DB_GET_CFG
    Scheduling task: DBMCFG
    Scheduling task: ENV_INST_INFO
    Scheduling task: ENV_GET_REG_VARIABLES
    Scheduling task: MON_GET_EXTENT_MOVEMENT_STATUS
    Scheduling task: MON_GET_GROUP_BUFFERPOOL
    Scheduling task: MON_GET_INDEX
    Scheduling task: MON_GET_LATCH
    Scheduling task: MON_GET_BUFFERPOOL
    Scheduling task: MON_GET_PAGE_ACCESS_INFO
    Scheduling task: MON_GET_PKG_CACHE_STMT
    Scheduling task: MON_GET_REBALANCE_STATUS
    Scheduling task: MON_GET_SERVERLIST
    Scheduling task: MON_GET_TABLE
    Scheduling task: MON_GET_TABLESPACE
    Scheduling task: MON_GET_UTILITY
    Scheduling task: MON_GET_WORKLOAD
    Scheduling task: VMSTAT
    Scheduling task: NETSTAT
    Scheduling task: PS_ELF
    Scheduling task: MON_GET_FCM
    Scheduling task: ARCHIVE
    Collection will begin in approximately 5 minutes ...
Closing connection ...
Done


Example of Stopping all data collection tasks:

The easiest way to stop all data collection tasks is to simply disable and remove the historical monitoring framework from the database, using the setup.py --cleanup argument:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup/
$ python3 setup.py testdb1 --cleanup
 
Connecting to database: TESTDB1
Unscheduling tasks ...
Dropping IBMHIST schema and its objects ...
Closing connection ...
Done


You can validate that no data collection tasks are scheduled within the Admin Task Scheduler by:

<db2inst1@hotellnx112> /home/db2inst1/db2histmon-master/1_setup.
$ db2 "SELECT NAME, PROCEDURE_NAME from SYSTOOLS.ADMIN_TASK_LIST"

NAME                    PROCEDURE_NAME                                                                           
----------------------- --------------------------------------------------------
  
 0 record(s) selected.




Stay tuned for new blogs expanding on the Db2 Historical Monitoring (db2histmon) suite, coming soon.



No comments:

Post a Comment

Translate