Monitoring and measuring performance metrics are essential
practices for maintaining and optimizing the performance of an IBM Db2 environment. By regularly monitoring and analyzing performance data, DBAs can identify bottlenecks, proactively address performance issues, and make
informed decisions to improve overall system efficiency.

It is important to monitor and measure performance
metrics in order to gain insights into the behavior of the applications and databases in use at your site. By examining their behavior and
identifying areas for improvement, you can improve the overall satisfaction of your customers and end users. 

Performance metrics provide valuable
information about resource utilization, query response times, disk I/O, CPU
usage, and other critical aspects of database performance. By tracking these
metrics over time, administrators can detect patterns, identify trends, and
pinpoint potential performance bottlenecks.

A Strategy

The first part of any Db2 performance management strategy should be
to provide a comprehensive approach to the monitoring of the Db2 subsystems
operating at your site. This approach involves monitoring not only the threads
accessing Db2 and the SQL they issue, but also the Db2 address spaces. You can
accomplish this task in three ways:

  • Batch reports run against DB2 trace records.
    While DB2 is running, you can activate traces that accumulate information,
    which can be used to monitor both the performance of the DB2 subsystem and the
    applications being run.
  • Online
    access to DB2 trace information and DB2 control blocks. This type of monitoring
    also can provide information on DB2 and its subordinate applications.
  • Sampling
    DB2 application programs as they run and analyzing which portions of the code
    use the most resources.

Using all three approaches can be a reasonable approach if you have analyzed the type of workloads in use and which types of monitoring are most effective for each. 
You also need to establish a strategy for your organization’s tuning goals. And your tuning goals
should be set using the discipline of service level management (SLM). A service
level is a measure of operational behavior. SLM ensures applications behave
accordingly by applying resources to those applications based on their
importance to the organization. Depending on the needs of the organization, SLM
can focus on availability, performance, or both. In terms of availability, the
service level can be defined as “99.95% up time, during the hours of 9:00 AM to
10:00 PM on weekdays.” Of course, a service level can be more specific, stating
“average response time for transactions will be two seconds or less for
workloads of strategic external clients.”

For a service
level agreement (SLA) to be successful, all of the parties involved must agree
upon stated objectives for availability and performance. The end users must be
satisfied with the performance of their applications, and the DBAs and
technicians must be content with their ability to manage the system to the
objectives. Compromise is essential to reach a useful SLA.

Furthermore, the
objectives of the SLA must be defined and measurable. For example, in the
earlier SLA you must define what a “strategic client” is and differentiate
strategic from nonstrategic. Although doing this may be difficult, failing to
do so can result in worthless SLAs that cannot be achieved.

In the end, if
you do not identify service levels for each transaction, then you will always
be managing to an unidentified requirement. Without a predefined and agreed
upon SLA, how will the DBA and the end users know whether an application is
performing adequately? Without SLAs, business users and DBAs might have
different expectations, resulting in unsatisfied business executives and
frustrated DBAs. Not a good situation.

Db2 Traces

One of the first types of performance metrics to consider is
monitoring based on reading trace information. You can think of a Db2 trace as
a window into the performance characteristics of aspects of your Db2 workload.
Db2 traces record diagnostic information describing particular events. As Db2
operates, it writes trace information that can be read and analyzed to obtain
performance information.

Db2 provides six types of traces, and each describes information
about the Db2 environment:

  • Accounting – Records
    performance information about the execution of DB2 application programs
  • Audit – Provides information about DB2 DDL,
    security,
    ­utilities, and data modification
  • Global – Provides information for the
    servicing of DB2
  • Monitor – Records data useful for online
    monitoring of the DB2 subsystem and DB2 application programs
  • Performance – Collects
    detailed data about DB2 events, enabling database and performance analysts to
    pinpoint the causes of performance problems
  • Statistics – Records
    information regarding the DB2 subsystem’s use of resources

You can start Db2 traces in two ways: by specifying the appropriate DSNZPARMs at Db2 startup
or by using the -START TRACE command to initiate specific traces when Db2 is already running.

Each trace is broken down further into classes, each of which
provides information about aspects of that trace. Additional informatiohn about the classes for each type of trace can be found here, here, and here.

You should understand what type of information is traced and the approximate overhead of each trace class before starting any of the Db2 traces. Some traces are better left off until or unless a performance problem is occurring, at which point, the trace can be started to capture details about the situation. Others are better to have turned on before problems occur. Keep in mind that you will have some trace classes (and IFCIDs) that are always started, and other that are only used in emergencies.

What are IFCIDs?

Each trace class is associated with specific trace events known as
Instrumentation Facility Component Identifier (IFCIDs), pronounced “if-kid.” An
IFCID defines a record that represents a trace event. IFCIDs are the single
smallest unit of tracing that can be invoked
by Db2.

In some cases, it can make sense to avoid activating trace classes
altogether and start traces specifying only the IFCIDs needed. This way, you
can reduce the overhead associated with tracing by recording only the trace
events needed. 

There are several hundred different IFCIDs. Most IFCIDs contain
data fields that describe events pertinent to the event being traced. Some
IFCIDs have no data; instead they merely mark a specific time. 
Certain trace events of extended durations require a pair of
IFCIDs: one for the beginning of the event and another for the end. These pairs
enable the computation of elapsed times. Other trace events that are not as
lengthy require only a single IFCID. 

You can find the IFCIDs associated with each trace class in the IBM Db2Command Reference manual in the section describing the -START TRACE command. But
that manual does not describe the purpose of each IFCID. A list describing each
IFCID can be found in the data set named  SDSNIVPD(DSNWMSGS), which is part of the Db2 installation

Db2 Performance Monitors

Several popular performance monitoring solutions are available for
Db2 for z/OS. IBM’s Omegamon, BMC Software’s MainView, Broadcom’s Sysview,
and Rocket Software’s TMON are the leading performance monitors. When
selecting a performance monitor, be sure to examine the online components as
well as the batch reporting capabilities of the monitor. 

An online performance monitor is a tool that provides real-time
reporting on Db2 performance statistics as Db2 operates. In contrast, a batch
performance monitor reads previously generated trace records from an input data
set. Most performance monitors today can be used to serve both purposes.

With online DB2 performance monitors, you can perform proactive
performance management tasks. In other words, you can set up the monitor such
that when it detects a problem it alerts a DBA and possibly takes actions on
its own to resolve the problem.

The most common way to provide online performance monitoring
capabilities is by online access to DB2 trace information in the MONITOR trace class. 
Some online DB2 performance monitors also provide direct access to
Db2 performance data by reading the control blocks of the Db2 and application
address spaces. This type of monitoring provides a “window” to up-to-the-minute
performance statistics while Db2 runs. This information is important if a quick
reaction to performance problems is required.

Most online Db2 performance monitors provide a menu-driven
interface accessible from TSO or VTAM. It enables online performance monitors
to start and stop traces as needed based on the menu options chosen by the user.
Consequently, you can reduce overhead and diminish the learning curve involved
in understanding Db2 traces and their correspondence to performance reports.

Following are some typical uses of online performance monitors.
Many online performance monitors can establish effective exception-based
monitoring. When specified performance thresholds are reached, triggers can
offer notification and take action. For example, you could set a trigger when
the number of lock suspensions for the TXN00002 plan is reached; when the
trigger is activated, a message is sent to the console, and a batch report is
generated to provide accounting detail information for the plan. You can set
any number of triggers for many thresholds. Following are some examples of thresholds you might choose to set:

  • When a buffer pool threshold is reached (PREFETCH DISABLED, DEFERRED WRITE THRESHOLD, or DM CRITICAL THRESHOLD).
  • For
    critical transactions, when predefined performance objectives are not met. For
    example, if TXN00001 requires subsecond response time, set a trigger to notify a DBA
    when the transaction receives a class 1 accounting elapsed time exceeding 1
    second by more than 25%.
  • Many
    types of thresholds can be established. Most online monitors support this
    capability. As such, you can customize the thresholds for the needs of your DB2
    environment.

Most online performance monitors can produce real-time EXPLAINs for
long-running SQL statements. If an SQL statement takes a significant amount of
time to process, an analyst can display the SQL statement as it executes and
dynamically issue an EXPLAIN for the statement. Even as the statement executes, an
understanding of why it takes so long to run can be achieved. A complete
discussion of the EXPLAIN statement is provided in the next chapter.

Some online performance monitors provide historical trending.
These monitors track performance statistics and store them in DB2 tables or in
VSAM files with a timestamp. They also provide the capability to query these
stores of performance data to assist in the following:

  • Analyzing recent history. Most SQL statements
    execute quickly, making difficult the job of capturing and displaying
    information about the SQL statement as it executes. However, you might not want
    to wait until the SMF data is available to run a batch report. Quick access to
    recent past-performance data in these external data stores provides a type of
    online monitoring that is as close to real time as is usually needed.
  • Determining
    performance trends, such as a transaction steadily increasing in its CPU
    consumption or elapsed time.
  • Performing
    capacity planning based on a snapshot of the recent performance of DB2
    applications.

Some monitors also run when Db2 is down to provide access to the
historical data accumulated by the monitor.

A final benefit of online DB2 performance
monitors is their capability to interface with other z/OS monitors for IMS/TM,
CICS, z/OS, or VTAM. This way, an analyst gets a view of the entire spectrum of
system performance. Understanding and analyzing the data from each of these
monitors, however, requires a different skill. Quite often, one person cannot
master all these monitors

Conclusion 

Although this blog entry was brief, and there are many additional aspects to Db2 performance monitoring, monitoring and measuring performance metrics should be viewed as a vital requirement for all organizations using Db2 databases. By collecting and analyzing performance data, DBAs can
detect performance bottlenecks, identify areas for improvement, and make
informed decisions to enhance overall system efficiency.