- David Sciaraffa, Software Engineering Manager – IBM Db2
(This blog is a follow-up to my previous blog which describes the Db2 Historical Monitoring (db2histmon) framework.)
The Db2 Historical Monitoring (db2histmon) scripts (available here) collect a broad range of Db2 and Operating System diagnostic data, and retain this data for a period of time, allowing for basic triage of many types of issues. Diagnostic information is collected at various intervals (ranging from 1min to 1hour, depending on the kind of data), and can be easily customized and expanded. The scripts will collect data only while the database is activated, which can help to reduce the generation of unnecessary redundant data.
This blog provides an overview of the new report generation script.
The new report generation script can be used to scrape previously generated db2histmon data collections and produce summary reports in multiple flavours, similar to the reports produced by the MONREPORT module. The reports are useful for for analyzing various metrics, and highlighting statistical outliers, from previous time periods.
Downloading the db2histmon report script:
The full set of db2histmon scripts can be downloaded from the open-source Github repo: https://github.com/IBM/db2histmon.The report.py script itself is located in the 4_report/ sub-folder:(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/$ ls1_setup/2_loader/3_quickparser/4_report/
Pre-requisites:
- The scripts can be deployed in any Unix or Windows environment where Db2 is supported.
- Python2 or higher.
- The python pandas, argparse, and numpy modules are also required to execute the report.py script.
Using pip to install the pandas, argparse, and numpy modules:
(db2inst1@myhost1) /home/db2inst1/$ pip install argparse(db2inst1@myhost1) /home/db2inst1/Collecting argparseDownloading argparse-1.4.0-py2.py3-none-any.whl (23 kB)Installing collected packages: argparseSuccessfully installed argparse-1.4.0$ pip install numpy(db2inst1@myhost1) /home/db2inst1/Collecting numpyDownloading numpy-1.19.4-cp38-cp38-win_amd64.whl (13.0 MB)|| 13.0 MB 656 kB/sInstalling collected packages: numpySuccessfully installed numpy-1.19.4$ pip install pandasCollecting pandasDownloading pandas-1.1.4-cp38-cp38-win_amd64.whl (8.9 MB)|| 8.9 MB 1.1 MB/sCollecting python-dateutil>=2.7.3Downloading python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)|| 227 kB 1.7 MB/sInstalling collected packages: python-dateutil, pandasSuccessfully installed pandas-1.1.4 python-dateutil-2.8.1
Types of reports:
The report.py script is currently capable of generating three unique types of reports:DBSUMMARY - Provides a summary of metrics for the entire database, similar to the MONREPORT.DBSUMMARY module.CONNECTION - Provides a summary of metrics for each connection, similar to the MONREPORT.CONNECTION module.PKGCACHE - Provides a summary of metrics for the top executed statements that are still cached within the package-cache, similar to the MONREPORT.PKGCACHE module.A report will display various metrics and data points for every data collection time period (or limited only to the time periods specified in the arguments), as described later.
Historical data which is used to generate the report:
The report.py script will scrape historical monitoring data that was previously collected by the Db2 Historical Monitoring (db2histmon) framework.Please see the 'The Data Collection and Data Archive folders' section of my previous blog which describes the Db2 Historical Monitoring (db2histmon) framework, if you're unsure where this historic monitoring data may reside.
For reference, I'm including a screen-shot showing the contents of the db2histmon data collection folder:
Usage and options:
The report.py script has the following usage options:
(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/
$ python report.py -h
usage: report.py [-h] [-r {dbsummary,connection,pkgcache}]
[-st START_TIME] [-et END_TIME] [-p PERIOD]
[-s] [-m MEMBERS [MEMBERS ...]]
[-ah APPLICATION_HANDLES [APPLICATION_HANDLES ...]] path
positional arguments:path path where collection directories are storedoptional arguments:-h, --help show this help message and exit-r, --report {dbsummary,connection,pkgcache},name of report to be generated-st START_TIME, --start_time START_TIMEdisplay collections after this time, format can be (YYYY-mm-dd HH:MM:SS) or copied from acollection (YYYY-mm-dd-HH-MM-SS.ffffff)-et END_TIME, --end_time END_TIMEdisplay collections before this time, format can be (YYYY-mm-dd HH:MM:SS) or copied from acollection (YYYY-mm-dd-HH-MM-SS.ffffff)-p PERIOD, --period PERIODdisplay collections at a lesser frequency, for example an interval of 3 will show every thirdcollection-s, --stats show min, max, mean, std for each series of values-m MEMBERS [MEMBERS ...], --members MEMBERS [MEMBERS ...]filter certain reports by member-ah APPLICATION_HANDLES [APPLICATION_HANDLES ...], --application_handles APPLICATION_HANDLES [APPLICATION_HANDLES ...]filter certain reports by application handle
Use the --report option to specify the type of report to generate, either DBSUMMARY, CONNECTION, or PKGCACHE.
The last argument specifies the folder path where the db2histmon historical monitoring data resides.
I'll explain the other options within the examples further below.
Example 1. Generate a DBSUMMARY report (on all available historical monitoring data)
(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/
$ python report.py -r DBSUMMARY /path/to/db2histmon/data_files/
Example 2. Generate a report on a subset of time-intervals, by using the START_TIME and END_TIME options
In some cases, the quantity of db2histmon historical monitoring data collections available or archived is very large and may make the reports too large or unreadable.
We can use the --START_TIME and --END_TIME options to limit the data collection time-intervals that we wish to generate a report on.
In this example, we wish to include all data collections that were generated after 2020-09-29 02:20 but before 2020-09-29 04:30:
(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/
$ python report.py -r DBSUMMARY --start_time "2020-09-29 02:20:00" --end_time "2020-09-29 04:30:00" /path/to/db2histmon/data_files/
Example 3. Generate a report with time-intervals condensed into larger time PERIODs.
We can also use the --PERIOD option to combine time-intervals from the data collection set into larger sets, such that fewer (but longer duration) time-intervals are displayed in the report.
In this example, many of our original db2histmon historical monitoring data collections are approximately 5minutes apart (as an be seen in the previous report's example above). By specifying the --PERIOD 3 option, we aggregate the data which would normally be reported in three separate time-periods into one time-period:
(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/
$ python report.py -r DBSUMMARY --period 3 /path/to/db2histmon/data_files/
Example 4. Generate a report showing MIN, MAX, MEAN statistical values.
Understanding the statistical values associated with a particular metric can be useful when looking for anomalies in the data set. In this example, I specify the --stats option to include the statistics for each reported value:
(db2inst1@myhost1) /home/db2inst1/db2histmon-master/4_report/
$ python report.py -r DBSUMMARY --period 3 /path/to/db2histmon/data_files/