How healthy is your z/OS on zD&T? (How to monitor z/OS Health)

Strongback Consulting

A good systems programmer should know how to monitor the health of his or her z/OS system. However, if you’re using IBM Z Development and Test Environment (zD&T), you’re likely a developer with only basic systems programming skills. Fortunately, the product comes with Health Checker for z/OS, which should be configured out of the box to run a plethora of general-purpose checks.

A Primer on z/OS Health Checker

The Health Checker is a component of MVS that can diagnose potential problems before they adversely impact your system. It is not a monitoring or diagnostic tool, but more of a validator that checks your system for deviations from standards and best practices. At work is a set of programs (called checks) that a started task (HZSPROC) runs at regular intervals. The started task stores the results in a sequential dataset, typically ADCD.&SYSNAME..HZSPDATA (as defined in the proc on zD&T).

How to get the health check output

If you’ve seen the following in your zD&T log, you might be wondering what it means:

OPRMSG: HZS0001I CHECK(IBMCSV,CSV_APF_EXISTS):
OPRMSG: CSVH0957E Problem(s) were found with data sets in the APF list.

In this case, it means that one of the health checks has run: IBMCSV is the check owner, and CSV_APF_EXISTS is the check name. This check makes sure that all the APF-authorized datasets actually exist. However, this entry in the log only tells you that the check ran and found a problem. It does not tell you which datasets were not found.

To get all the details, you run a JCL job against Health Checker that prints the check results it holds in storage. There is a sample JCL member, HZSPRINT, in SYS1.SAMPLIB that you can copy and tailor to your liking. In a nutshell, it’s a job that queries that storage, gets the output, and writes it in a readable format wherever you want (a dataset, a USS log file, or JES SYSOUT). For the message above, I tailored it to query only the CSV_APF_EXISTS check and send the output to SYSOUT as follows:

//HZSPRINT EXEC PGM=HZSPRNT,TIME=1440,REGION=0M,PARMDD=SYSIN
//SYSIN DD *
CHECK(IBMCSV,CSV_APF_EXISTS)
,EXCEPTIONS
//SYSOUT DD SYSOUT=A,DCB=(LRECL=256)

This spits out the info I need to determine which datasets are missing, via the SYSOUT.

IBM z/OS Explorer JES View

Opening up the SYSOUT from the JES spool, I see the following:

*
* Start: CHECK(IBMCSV,CSV_APF_EXISTS)                                  *
* 
 
 CHECK(IBMCSV,CSV_APF_EXISTS)
 SYSPLEX:    ADCDPL    SYSTEM: S0W1
 START TIME: 04/14/2021 07:18:37.465875
 CHECK DATE: 20071120  CHECK SEVERITY: LOW
 CHECK PARM: MIGRATEDOK(SYSTEM)
 CSVH0955I A problem was found with each APF list entry displayed.
 VOLUME DSNAME                                       ERROR
 A4CFG1 NETVIEW.V621USER.VTAMLIB                     DS not found
 * Low Severity Exception *
 CSVH0957E Problem(s) were found with data sets in the APF list.

In my case the correct dataset was NETVIEW.VTAMLIB, so I corrected my parmlib member and all is well. Other checks may have raised exceptions as well; specifying CHECK(*,*) in your JCL member reports on all of them.
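As a rough sketch of that CHECK(*,*) variant, the step above can be reused with only the filter changed. The job card here is hypothetical (job name, accounting field, and classes are placeholders, not taken from the zD&T defaults), and the comment about EXCEPTIONS reflects my understanding of the HZSPRNT parameter rather than anything verified on this system:

```jcl
//HCALL    JOB (ACCT),'HC EXCEPTIONS',CLASS=A,MSGCLASS=H
//* Hypothetical job card above: adjust the job name, accounting
//* field, and classes to your installation's standards.
//* CHECK(*,*) selects every check owner and every check name.
//* EXCEPTIONS is assumed to limit the report to checks that
//* found a problem, as in the single-check example.
//HZSPRINT EXEC PGM=HZSPRNT,TIME=1440,REGION=0M,PARMDD=SYSIN
//SYSIN    DD *
CHECK(*,*)
,EXCEPTIONS
/*
//SYSOUT   DD SYSOUT=A,DCB=(LRECL=256)
```

Expect considerably more output than the single-check run, since every check that raised an exception gets its own report section in the SYSOUT.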

How to see what checks are configured

To display which checks are configured to run, issue an operator MODIFY command to HZSPROC, like this:

f hzsproc,display,checks,check=(IBMCSV,*),detail

In this case, it should list all the Contents Supervision (IBMCSV) checks.

Checks are configured in the HZSPRMxx member in your system parmlib. You can change the checks there, and the changes will take effect at your next IPL. You can also make changes dynamically using the modify command above. See this cheat sheet for examples.

For more details on what checks are available, see the IBM Health Checker for z/OS checks – IBM Documentation.
