Level: Introductory Yongli An (yongli@ca.ibm.com), DB2 Performance Engineer, IBM Toronto Lab Peter Shum (shum@ca.ibm.com), DB2 e-business Performance Manager, IBM Toronto Lab
11 Jan 2002 This paper focuses on a number of DB2 tuning tips based on lessons learned from running Online Transaction Processing (OLTP)-type performance benchmarks (TPC-C, TPC-W, Trade2, etc.). Performance of a database application can be influenced by many factors. This paper concentrates on the configuration aspect of DB2 and does not take into account details with regard to capacity planning, database design, or application design with the exception of a few brief reminders.
©2002 International Business Machines Corporation. All rights reserved.
Introduction
DB2® Universal Database™ (UDB) is the first
multimedia, Web-ready relational database management system that is
strong enough to meet the demands of large corporations and
flexible enough to serve small- and medium-sized businesses. DB2
product family software together with Internet technology makes
information easily accessible, available, and secure across diverse
platforms. There are more than 60 million DB2 users from over
300,000 companies worldwide relying on IBM data management
solutions.
DB2 UDB powers the most demanding e-business
applications, such as electronic commerce, enterprise resource
planning, customer relationship management, supply-chain
management, Web self-service, and business intelligence. It is the
scalable, industrial-strength database ideal as the data management
foundation for your evolution into e-business.
Online transaction processing (OLTP) is a class
of application that facilitates and manages transaction-oriented
applications, typically for data-entry and retrieval transactions
in a number of industries, including banking, airlines, mail order,
supermarkets, and manufacturers. Typically, an OLTP workload
consists of many concurrently running short transactions. Today's
online transaction processing increasingly requires support for
transactions that span a network and may include more than one
company. For this reason, new OLTP software uses client/server
processing and brokering software that allows transactions to run
on different computer platforms in a network.
Performance is one of the most important
factors in any type of database system. This article focuses on a
number of DB2 performance tuning tips based on lessons learned from
running OLTP-type performance benchmarks (TPC-C, TPC-W, Trade2,
etc.). Although performance of a database application can
influenced by many factors, we focus on configuration rather than
on such things as capacity planning, database design, or
application design.
This article is organized as follows:
Performance
fundamentals
Updating catalog
statistics, which emphasizes the importance of collecting and
maintaining up-to-date database statistics, the lack of which is
often the source of many of the performance problems.
Monitoring and tuning
database configuration parameters, which describes a list of
database manager parameters and database parameters in the order of
importance. Normally it's not necessary to go through the whole
list to achieve your performance goal. You can just try several of
them at the top of the list to see if there's any performance
improvement.
With these tips, you can get your OLTP
application up and running with reasonably good performance.
Performance
fundamentals
- Have enough memory.
- For a 32-bit system, use at least 512 MB of RAM per CPU, up to
4 GB per machine, to support the buffer pools, DB2 agents, and
other shared memory objects required for a large number of
concurrent users. (See Buffer pool
size (BUFFPAGE) for more information on buffer pools.) More
memory may be needed to support applications that run locally or as
stored procedures. On AIX®, additional memory can be used by the
JFS file cache to supplement the buffer pool.
- For a 64-bit system, the buffer pool can be practically any
size. However, for most e-commerce OLTP applications that use a
large database, the buffer pool doesn't really need to be more than
8 GB. Bigger is still better, but at some point you'll experience
diminishing returns as the buffer pool hit ratio moves to the 98+%
range. The number of concurrent users (with its impact on the
number of DB2 agents) determines how much more memory is
required.
- The amount of memory required by each user connection into the
database (that is, a DB2 agent) depends on the nature of the SQL
statements performed by the application -- such as the number of
concurrent cursors opened and the amount of sorting and temp space
required. For OLTP applications, there should be less sorting and
temp space required and only a handful of concurrent cursors opened
at a time.
-
Rule of thumb: Use a minimum of 1 MB for
UNIX and 500 KB for Windows for each DB2 agent. If fenced stored
procedures are used, then each user connection has two DB2 agents,
in addition to the memory required to run the stored procedure
application.
- Have sufficient I/O handling capability.
- There must be enough disk arms to ensure sufficient I/O
parallelism to support a high volume of concurrent transactions.
There should be at least 5 to 10 disks per CPU for a moderate
workload, and 20 disks for a high-I/O OLTP workload. The operating
system (including paging space), DB2 logs, and DB2 table spaces
should each have their own dedicated disks. There should be
multiple disks for the DB2 logs, for tables, and for indexes.
- The proper way to estimate the I/O handling capability that is
needed for good performance is actually to prototype the
transactions and find out how many I/Os are required per
transaction, and how many transactions per second are required.
Then find out the I/O rate for the disk controller and the disk
subsystem to help determine how many controllers and disks are
required.
- Have sufficient network bandwidth.
There must be enough network bandwidth to support the workload.
Make sure that the network or any intermediate hubs are not a
bottleneck. This is especially significant when supporting access
from remote location. For example, a T1 line supports 1.544
Mbit/sec, which is only 0.193 MB/sec, whereas a typical 10 Mbit/sec
thernet LAN can support 6x the throughput at 1.25 MB/sec. Use
commands such as netstat on UNIX to monitor the traffic volumes on
your connections.
- Use the DB2 Performance Configuration Wizard from the DB2
Control Center to set up the initial DB2 Database Manager and
Database Configuration parameters.
This tool will ask you a series of questions
about the nature of the workload to determine a starting set of
configuration parameter values. You can modify these parameters to
suit your production workload.
- Index your table columns appropriately.
- Ensure that columns that are joined in queries have
indexes.
- It can improve performance if columns involved in ORDER BY and
GROUP BY are indexed.
- Frequently accessed data can also be included within an index
as INCLUDED columns.
- Use the Index Advisor (also known as Index Wizard from the DB2
Control Center) to help determine a good set of indexes to use,
based on the tables and SQL statements that you use.
- Ensure that applications hold locks for as short a time as
possible.
-
- When a user operation involves multiple interactions, each
interaction should have its own transaction to commit and should
free up all locks before returning activity to the user. Keep the
duration of a transaction as short as possible by starting its
first SQL statement (which starts a transaction) as late as
possible, and its updates (inserts, updates, and deletes, which use
exclusive locks) as close to the commit stage as possible.
- Use of the DB2 registry parameter DB2_RR_TO_RS can improve
concurrency by not locking the next key of the row that was
inserted or updated. This can be used if the isolation level RR
(Repeatable Read) is not used by any programs that operate on the
same set of tables. Use DB2 Snapshot to monitor the number of
deadlocks and lock waits.
- Use stored procedures or compound SQL to minimize the network
cost.
- Minimizing the number of network trips for your SQL statements
will save on network latency and context switches, which can result
in the application holding onto locks for a shorter period of time
Generally, a stored procedure should be used when an OLTP
transaction has more than 4 or 5 statements.
- On the other hand, if there is some complicated CPU-intensive
processing involved in the application logic, leaving this in a
stored procedure running on the database server can use up
excessive CPU cycles on the database server at the expense of
database operations. In this case, either do not use stored
procedure, or execute part of the logic in the client side and
execute the rest in a stored procedure.
- Use SQL efficiently.
- In general, don't use multiple SQL statements where one will
do. When you provide more detailed search conditions by having more
predicates in a query, the optimizer has a chance to make better
choices. You should also be selective in your query so that the
database does not return more rows and columns than you need. For
example, use SQL to filter the rows you want; don't return all the
rows and then require the application to do the filtering.
- Analyze the access plan.
- Use Visual Explain or db2exfmt to analyze each SQL statement.
Make sure appropriate indexes are used to minimize the rows that
have to be fetched internally when selecting and joining
tables.
 |
Updating catalog
statistics
Background
The RUNSTATS utility updates statistics in the
system catalog tables to help with the query optimization process.
Without these statistics, the database manager could make a
decision that would adversely affect the performance of an SQL
statement. The RUNSTATS utility allows you to collect statistics on
the data contained in the tables, indexes, or both tables and
indexes. Use the RUNSTATS utility to collect statistics based on
both the table and the index data to provide accurate information
to the access plan selection process in the following
situations:
- When a table has been loaded with data, and the appropriate
indexes have been created.
- When a table has been reorganized with the REORG utility.
- When there have been extensive updates, deletions, and
insertions that affect a table and its indexes. ("Extensive" in
this case may mean that 10 to 20 percent of the table and index
data has been affected.)
- Before binding application programs whose performance is
critical.
- When you want to compare new statistics with previous
statistics. Running statistics on a periodic basis enables you to
discover performance problems at an early stage.
- When the prefetch quantity is changed.
- When you have used the REDISTRIBUTE NODEGROUP utility.
When optimizing SQL queries, the decisions made
by the SQL compiler are heavily influenced by the optimizer's model
of the database contents. This data model is used by the optimizer
to estimate the costs of alternative access paths that can be used
to resolve a particular query. A key element in the data model is
the set of statistics gathered about the data contained in the
database and stored in the system catalog tables. This includes
statistics for tables, nicknames, indexes, columns, and
user-defined functions (UDFs). A change in the data statistics can
result in a change in the choice of access plan selected as the
most efficient method of accessing the desired data.
Examples of the statistics available which help
define the data model to the optimizer include:
- The number of pages in a table and the number of pages that are
not empty.
- The degree to which rows have been moved from their original
page to other (overflow) pages.
- The number of rows in a table.
- Statistics about individual columns such as he number of
distinct values in a column.
- The degree of clustering of an index; that is, the extent to
which the physical sequence of rows in a table follows an
index.
- Statistics about the index such as the number of index levels
and the number of leaf pages in each index.
- The number of occurrences of frequently used column
values.
- The distribution of column values across the range of values
present in the column.
- Cost estimates for user-defined functions (UDFs).
RUNSTATS can help you determine how performance
is related to changes in your database. The statistics show the
data distribution within a table. When used routinely, RUNSTATS
provides data about tables and indexes over a period of time,
thereby allowing performance trends to be identified for your data
model as it evolves over time. Rebind applications that use static
SQL after using RUNSTATS so that the query optimizer can choose the
best access plan given the new statistics. However, for
applications using dynamic SQL (e.g. most vendor applications)
rebinding is not necessary since the statement will be optimized
based on the statistics at run time. When statistical information
about tables is not accurate, it may cause performance problems. In
a worst-case scenario, a particular SQL statement may cause DB2 to
use a table scan instead of an index scan.
How to update the statistics
Statistics for objects are updated in the
system catalog tables only when explicitly requested. There are
several ways to update some or all of the statistics:
- Using the RUNSTATS (run statistics) utility.
- Using LOAD, with statistics collection options specified.
- Coding SQL UPDATE statements that operate against a set of
predefined catalog views.
- Using the "reorgchk update statistics" command.
When you do not exactly know all the table
names, or there are too many, the easiest way to do RUNSTATS is to
use the "db2 reorgchk update statistics" command. The exact script
looks like this:
db2 -v connect to DB_NAME
db2 -v "select tbname, nleaf, nlevels, stats_time from sysibm.sysindexes"
db2 -v reorgchk update statistics on table all
db2 -v "select tbname, nleaf, nlevels, stats_time from sysibm.sysindexes"
db2 -v terminate
|
The example we chose above does not require
table names. This one command performs RUNSTATS on all tables.
Remember: Don't run the RUNSTATS utility
until after you have populated the database.
If you know the name of the table and to avoid
having large numbers of tables that may take a long time to
complete, it's preferable to do RUNSTATS on each table one at a
time. The command looks like the following:
db2 -v runstats on table TAB_NAME and indexes all
|
This will collect statistics by table and all
indexes (basic level).
Checking to see if RUNSTATS has been run
One quick way to see whether RUNSTATS has been
performed on your database is to query some system catalog tables.
For example, as shown in the script above, you can run this
command:
db2 -v "select tbname, nleaf, nlevels, stats_time from sysibm.sysindexes"
|
If RUNSTATS has not yet been run, you will see
"-1" for the nleaf and nlevels columns, and a "-" for the
stats_time column. These columns contain real numbers if RUNSTATS
has been run, and the stats_time column will contain the timestamp
when RUNSTATS ran. If you think the time shown in stats_time is too
old, it's time to do runstats again.
Monitoring and tuning
database configuration parameters
The following tips on database configuration
tuning will get you started in an OLTP environment with reasonably
good performance and at the same time enable you to avoid obvious
pitfalls. Among the configuration parameters, database manager
configuration parameters require a restart of the database manager,
and most database configuration parameters require the application
to reconnnect to the database in order to have the changes take
effect. The configuration parameters described here include:
Buffer Pool
Size
Background
A buffer pool is an area of storage in memory
into which database pages (containing table rows or index entries)
are temporarily read and changed. The purpose of the buffer pool is
to improve database system performance. Data can be accessed much
faster from memory than from a disk. Therefore, the fewer times the
database manager needs to read from or write to a disk, the better
the performance. The configuration of one or more buffer pools is
the single most important tuning area, since it is here that most
of the data manipulation takes place for applications connected to
the database (excluding large objects and long field data).
By default, applications use the buffer pool
called IBMDEFAULTBP, which is created when the database is created.
The DB2 database configuration parameter BUFFPAGE controls the size
of a buffer pool when the value of NPAGES is -1 for that buffer
pool in the SYSCAT.BUFFERPOOLS catalog table. Otherwise the
BUFFPAGE parameter is ignored, and the buffer pool is created with
the number of pages specified by the NPAGES parameter.
Recommendations
For applications that only use one buffer pool,
change NPAGES to -1 so that BUFFPAGE controls the size of the
buffer pool. This makes it easier to update and report the buffer
pool size along with other DB2 database configuration
parameters.
After making sure that you can use the BUFFPAGE
parameter in the database configuration to control the buffer pool
size, set it to a proper value. Setting it to a reasonably large
value is a safe thing based on the size of your database and the
nature of your application. Usually, the default value of this
parameter is very small and may not be satisfactory. Consider the
following:
- As a starting point, and if you have enough memory on you your
machine, set BUFFPAGE to 40,000 pages (160 MB), or 10% of the total
memory on your machine.
- For for a large OLTP database, set aside as much as memory as
possible for the buffer pool while keeping the system stable.As a
starting point, try 1.6 GB and then experiment with more.
How to change the parameter
Run the following script to:
- Verify the catalog value
- Enable the use of the database configuration parameter
BUFFPAGE
- Update the value of BUFFPAGE for all databases.
db2 -v connect to DB_NAME
db2 -v select * from syscat.bufferpools
db2 -v alter bufferpool IBMDEFAULTBP size -1
db2 -v connect reset
db2 -v update db cfg for dbname using BUFFPAGE bigger_value
db2 -v terminate
|
Investigative steps
To determine whether the BUFFPAGE parameter is
in use for buffer pool size of database, run:
db2 -v connect to DB_NAME
db2 -v SELECT * from SYSCAT.BUFFERPOOLS
db2 -v connect reset
db2 -v terminate
|
Examine the results. If each buffer pool has an
NPAGES value of -1, then the buffer pool size is being controlled
through the BUFFPAGE parameter in the database configuration.
To determine whether the database buffer pool
size is big enough, collect snapshots for the database and/or
buffer pool while running the application. A script similar to the
following will give you the needed information:
db2 -v update monitor switches using bufferpool on
db2 -v get monitor switches db2 -v reset monitor all
-- run your application --
db2 -v get snapshot for all databases > snap.out
db2 -v get snapshot for dbm >> snap.out
db2 -v get snapshot for all bufferpools >> snap.out
db2 -v reset monitor all db2 -v terminate
|
Make sure that you issue the "db2 -v get
snapshot" before you lose your database connection. When the last
application disconnects from the database, the database terminates
and all snapshot statistics will be lost. To ensure there is always
a connection that keeps the database up, use one of the following
methods:
- Maintain one separate connection in the window where you are
collecting snapshots.
- Use the DB2 ACTIVATE DATABASE command.
In the snapshot output, either from the
database snapshot or buffer pool snapshot, look for the following
"logical reads" and "physical reads" so that you can calculate the
buffer pool hit ratio, which can help you tune your buffer
pools:
-- Related lines from a sample of bufferpool snapshots --
Buffer pool data logical reads = 702033
Buffer pool data physical reads = 0
Buffer pool data writes = 414
Buffer pool index logical reads = 168255
Buffer pool index physical reads = 0
|
The buffer pool hit ratio indicates the
percentage of time that the database manager did not need to load a
page from disk in order to service a page request; that is, the
page was already in the buffer pool. The greater the buffer pool
hit ratio, the lower the frequency of disk I/O. Calculate the
buffer pool hit ratio as follows:
(1 - (
(buffer pool data physical reads + buffer pool index physical reads)
/
(buffer pool data logical reads + pool index logical reads)
)
)
* 100%
|
This calculation takes into account all of the
pages (index and data) that are cached by the buffer pool. Ideally
this ratio should be over 95%, and as close to 100% as possible. To
increase the buffer pool hit ration, try the following:
- Increase the buffer pool size.
- Consider allocating multiple buffer pools, possibly one for
each frequently-accessed large table with its own table space, and
one for a group of small tables, and then experiment with different
sizes of buffer pools to see which combination provides the best
performance.
Avoid overallocating memory to buffer pools if
the memory allocated cannot help performance. The buffer pool sizes
should be determined based on snapshot information taken from the
test environment.
Log buffer size
(LOGBUFSZ)
Background
LOGBUFSZ is a database configuration parameter.
It is the parameter for the log buffer. It allows you to specify
the amount of database shared memory to use as a buffer for log
records before writing these records to disk. The log records are
written to disk when one of the following events occurs:
- A transaction commits.
- The log buffer is full.
- As a result of some other internal database manager event.
Buffering the log records results in more
efficient log file I/O, because the log records are written to disk
less frequently and more log records are written each time.
Increase the size of this buffer area if there is considerable read
activity on a dedicated log disk, or if there is high disk
utilization. When increasing the value of this parameter, consider
the DBHEAP parameter, too, because the log buffer area uses space
controlled by the DBHEAP parameter.
How to change the parameter
We discovered that the default value for this
parameter, 8 (4KB pages), usually is not big enough for an OLTP
database. The optimal value for LOGBUFSZ is 128, or 256 4KB pages.
For example, you can use the command below to change it:
db2 -v update database cfg for DB_NAME using LOGBUFSZ 256
db2 -v terminate
|
Investigative steps
Use the database snapshot to determine whether
the LOGBUFSZ parameter is optimal or not by looking at the lines
shown in the following example:
Log pages read = 0
Log pages written = 12644
|
In general, the ratio between "log pages read"
and "log pages written" should be as small as possible. An ideal
value would be zero log pages read while seeing a good number of
log pages written. When there are too many log pages read, it means
a bigger LOGBUFSZ is needed.
Application heap size (APPHEAPSZ)
Background
APPHEAPSZ is a database configuration parameter
that definesthe number of private memory pages available to be used
by the database manager on behalf of a specific agent or subagent.
The heap is allocated when an agent or subagent is initialized for
an application. The amount allocated is the minimum amount needed
to process the request given to the agent or subagent. As the agent
or subagent requires more heap space to process larger SQL
statements, the database manager will allocate memory as needed, up
to the maximum specified by this parameter.
How to change the parameter
Here is the command to change the default value
(128 4KB pages for DB2 EE or 64 4KB pages for DB2 EEE) to the
optimal value:
db2 -v update db cfg for DB_NAME using applheapsz 256
db2 -v terminate
|
Investigative steps
When your applications receive an error
indicating that there is not enough storage in the application
heap, increase the value of APPHEAPSZ.
Sort heap size
(SORTHEAP) and sort heap threshold (SHEAPTHRES)
Background
SORTHEAP is a database configuration parameter
that defines the maximum number of private memory pages to be used
for private sorts, or the maximum number of shared memory pages to
be used for shared sorts. If the sort is a private sort, then this
parameter affects agent private memory. If the sort is a shared
sort, then this parameter affects the database shared memory. Each
sort has a separate sort heap that is allocated as needed, by the
database manager. This sort heap is the area where data is sorted.
If directed by the optimizer, a smaller sort heap than the one
specified by this parameter is allocated using information provided
by the optimizer.
SHEAPTHRES is a database manager configuration
parameter. Private and shared sorts use memory from two different
memory sources. The size of the shared sort memory area is
statically predetermined at the time of the first connection to a
database based on the value of SHEAPTHRES. The size of the private
sort memory area is unrestricted. The SHEAPTHRES parameter is
applied differently for private and shared sorts:
- For private sorts, SHEAPTHRES is an instance-wide soft limit on
the total amount of memory that can be consumed by private sorts at
any given time. When the total private-sort memory consumption for
an instance reaches this limit, the memory allocated for additional
incoming private-sort requests is considerably reduced.
- For shared sorts, SHEAPTHRES is a database-wide hard limit on
the total amount of memory consumed by shared sorts at any given
time. When this limit is reached, no further shared-sort memory
requests are allowed until the total shared-sort memory consumption
falls below the limit specified by SHEAPTHRES.
Examples of operations that use the sort heap
include hash joins and operations where the table is in memory.
Explicit definition of the threshold prevents the database manager
from using excessive amounts of memory for large numbers of
sorts.
Recommendations
- Use the database system monitor to track sort activity.
- Use appropriate indexes to minimize the use of the sort
heap.
- When frequent large sorts are required, increase the value of
SORTHEAP.
- If you increase SORTHEAP, determine whether the SHEAPTHRES
parameter in the database manager configuration file also needs to
be adjusted.
- The sort heap size is used by the optimizer in determining
access paths. Consider rebinding applications (using the REBIND
PACKAGE command) after changing this parameter.
- Ideally, you should set the sort heap threshold (SHEAPTHRES)
parameter to a reasonable multiple of the largest SORTHEAP
parameter you have in your database manager instance. This
parameter should be at least two times the largest SORTHEAP defined
for any database within the instance.
How to change the parameters
To change the values of SORTHEAP and
SHEAPTHRES, run the following commands:
-- SORTHEAP should be changed for individual database --
db2 -v update db cfg for DB_NAME using SORTHEAP a_value
-- SHEAPTHRES is a database manager parameter --
db2 -v update dbm cfg using SHEAPTHRES b_value
db2 -v terminate
|
Investigative steps
OLTP applications should not be performing
large sorts. They are too costly in terms of CPU and I/O resource.
Usually, the default value for SORTHEAP size (256 4KB pages) is
adequate. In fact, for high concurrency OLTP, you may want to
decrease this value from the default. When further investigation is
needed, you can issue the following command:
db2 -v update monitor switches using sort on
|
Then, let your application run for a while, and
type:
db2 -v get snapshot for database on DBNAME
|
Look at the output in the following
example:
Total sort heap allocated = 0
Total sorts = 1
Total sort time (ms) = 0
Sort overflows = 0
Active sorts = 0
Commit statements attempted = 1
Rollback statements attempted = 0
Dynamic statements attempted = 4
Static statements attempted = 1
Binds/precompiles attempted = 0
|
From this, you can calculate the number of
sorts per transaction and the percentage of sorts that overflowed
the memory that was available to them.
SortsPerTransaction =
(Total Sorts)
/
(Commit statements attempted + Rollback statements attempted)
PercentSortOverflow = (Sort overflows * 100 ) / (Total sorts)
|
Rule of thumb: If SortsPerTransaction is
greater than 5, it might indicate there are too many sorts per
transaction. If PercentSortOverflow is greater than 3 percent,
there may be serious and unexpected large sorts occurring. When
this happens, increasing SORTHEAP just hides the performance
problem--it does not fix it. The true solution to this problem is
to improve the access plan for problematic SQL statements by adding
the correct indexes.
Number of agents
(MAXAGENTS, NUM_POOLAGENTS and NUM_INITAGENTS)
Background
These are database manager configuration
parameters.
Recommendations
In most cases, set MAXAGENTS and NUM_POOLAGENTS
to a value that slightly exceeds the maximum expected number of
concurrent application connections.
Leaving NUM_INITAGENTS as the default should be
fine.
How to change the parameter
In order to change these parameters, run the
following commands:
db2 -v update dbm cfg using MAXAGENTS a_value
db2 -v update dbm cfg using NUM_POOLAGENTS b_value
db2 -v update dbm cfg using NUM_INITAGENTS c_value
db2 -v terminate
|
Investigative steps
Anytime during a run, you can use the following
command to get the snapshot data for database manager:
db2 -v get snapshot for database manager
|
and look for the following lines of output:
High water mark for agents registered = 4
High water mark for agents waiting for a token = 0
Agents registered = 4
Agents waiting for a token = 0
Idle agents = 0
Agents assigned from pool = 5
Agents created from empty pool = 4
Agents stolen from another application = 0
High water mark for coordinating agents = 4
Max agents overflow = 0
|
If you find that either "Agents waiting for a
token" or "Agents stolen from another application" is not equal to
0, you may need to increase MAXAGENTS to allow more agents to be
available to the database manager.
Locks (LOCKLIST, MAXLOCKS and
LOCKTIMEOUT)
Background
These lock-related controls are database
configuration parameters:
- LOCKLIST indicates the amount of storage that is allocated to
the lock list. There is one lock list per database, and it contains
the locks held by all applications concurrently connected to the
database. Locking is the mechanism that the database manager uses
to control concurrent access to data in the database by multiple
applications. Both rows and tables can be locked. Each lock
requires 32 or 64 bytes of the lock list, depending on whether or
not other locks are held on the object:
- 64 bytes are required to hold a lock on an object that has no
other locks held on it.
- 32 bytes are required to record a lock on an object that has an
existing lock held on it.
- MAXLOCKS defines a percentage of the lock list held by an
application that must be filled before the database manager
performs lock escalation. When the percentage of the lock list used
by one application reaches MAXLOCKS, the database manager escalates
the locks, which means it replaces row locks with table locks,
thereby reducing the number of locks in the list. When the number
of locks held by any one application reaches this percentage of the
total lock list size, lock escalation occurs for the locks held by
that application. Lock escalation also occurs if the lock list runs
out of space. The database manager determines which locks to
escalate by looking through the lock list for the application and
finding the table with the most row locks. If after replacing these
with a single table lock, the MAXLOCKS value is no longer exceeded,
lock escalation stops. If not, lock escalation continues until the
percentage of the lock list held is below the value of MAXLOCKS.
The MAXLOCKS parameter multiplied by the MAXAPPLS parameter cannot
be less than 100.
Although the escalation process itself does not
take much time, locking entire tables (versus individual rows)
decreases concurrency, and overall database performance may
decrease for subsequent accesses against the affected tables.
Suggestions for controlling the size of the
lock list include:
- Commit frequently to release locks.
- When performing many updates, lock the entire table for the
duration of the transaction before updating (using the SQL LOCK
TABLE statement). This uses only one lock and keeps others from
interfering with the updates, but it does reduce concurrency of the
data to other users.
- Use the LOCKSIZE parameter of the ALTER TABLE statement to
control how locking is done for a specific table on a permanent
basis.
- Examine the isolation level used for the application. Using the
Repeatable Read isolation level may result in an automatic table
lock in some cases. Use the Cursor Stability isolation level when
possible to decrease the number of share locks held. If application
integrity requirements are not compromised, use Uncommitted Read
instead of Cursor Stability to further decrease the amount of
locking.
Use the following steps to determinethe number
of pages required for your lock list:
- Calculate a lower bound for the size of your lock list: (512 *
32 * MAXAPPLS) / 4096, where 512 is an estimate of the average
number of locks per application and 32 is the number of bytes
required for each lock against an object that has an existing
lock.
- Calculate an upper bound for the size of your lock list: (512 *
64 * MAXAPPLS) / 4096, where 64 is the number of bytes required for
the first lock against an object.
- Estimate the amount of concurrency you will have against your
data and, based on your expectations, choose an initial value for
lock list that falls between the upper and lower bounds that you
have calculated.
Use the database system monitor to tune the
MAXLOCKS valuer.
When setting MAXLOCKS, consider the size of the
lock list (LOCKLIST):
MAXLOCKS = 100 * (512 locks per application *
32 bytes per lock * 2) / (LOCKLIST * 4096 bytes)
This sample formula allows any application to
hold twice the average number of locks. You can increase MAXLOCKS
if only a few applications run concurrently, because there will not
be a lot of contention for the lock list space under these
conditions.
LOCKTIMEOUT specifies the number of seconds that an application
will wait to obtain a lock. This helps avoid global deadlocks for
applications.
If you set this parameter to 0, the application will not wait
for locks. In this situation, if no lock is available at the time
of the request, the application immediately receives a -911.
If you set this parameter to -1, lock timeout detection is
turned off. In this situation, the application will wait for a lock
(if one is not available at the time of the request) until either
the lock is granted or until a deadlock occurs.
Recommendation
Set LOCKTIMEOUT to quickly detect waits that
are occurring because of an abnormal situation, such as a
transaction that is stalled (possibly as a result of a user leaving
their workstation.) Set it high enough so that valid lock requests
do not time-out because of peak workloads, during which time there
is an increased wait for locks.
In an online transaction processing (OLTP)
environment, start with a value of 30 seconds. In a query-only
environment you could start with a higher value. In either case,
use benchmarking techniques to tune this parameter.
How to change the parameters
To change the lock parameters, run the
following commands:
db2 -v update
db cfg for DB_NAME using LOCKLIST a_number
db2 -v update db cfg for DB_NAME using MAXLOCKS b_number
db2 -v update db cfg for DB_NAME using LOCKTIMEOUT c_number
db2 -v terminate
|
Investigative steps
Once the lock list is full, performance can
degrade because lock escalation generates more table locks and
fewer row locks, thus reducing concurrency on shared objects in the
database. Additionally, there may be more deadlocks between
applications (because they are all waiting on a limited number of
table locks), which will result in transactions being rolled back.
Your application will receive an SQLCODE of -912 when the maximum
number of lock requests has been reached for the database. If lock
escalations are causing performance concerns you may need to
increase the value of LOCKLIST parameter or the MAXLOCKS parameter.
You may use the database system monitor to determine if lock
escalations are occurring, to track the number of times an
application (connection) experienced a lock timeout, or that a
database detected a timeout situation for all applications that
were connected.
- First, run the following command to turn on the DB2 monitor for
locks:
db2 -v update monitor switches using lock on
db2 -v terminate
|
- Then collect your snapshots for the database:
db2 -v get snapshot for database on DB_NAME
|
- In the snapshot output, examine the following items:
Locks held currently = 0
Lock waits = 0
Time database waited on locks (ms) = 0
Lock list memory in use (Bytes) = 504
Deadlocks detected = 0
Lock escalations = 0
Exclusive lock escalations = 0
Agents currently waiting on locks = 0
Lock Timeouts = 0
Internal rollbacks due to deadlock = 0
|
If the "Lock list memory in use (Bytes)"
exceeds 50 percent of the defined LOCKLIST size, then increase the
number of 4KB pages in the LOCKLIST database configuration
parameter. The lock escalations, lock timeouts and deadlocks will
indicate some potential problems in your system or application. The
locking problems normally indicate some fairly significant
concurrency problems in the application that should be dealt with
before the lock list parameter is increased.
Maximum number of
active applications (MAXAPPLS)
Background
MAXAPPLS is a database configuration parameter.
It specifies the maximum number of concurrent applications (both
local and remote) that can be connected to a database. Because each
application that attaches to a database requires some private
memory to be allocated, allowing a larger number of concurrent
applications will use more memory. The value of this parameter must
be equal to or greater than the sum of the connected applications,
plus the number of these same applications that may be concurrently
in the process of completing a two-phase commit or rollback.
Recommendations
To run an OLTP application, make sure that
MAXAPPLS is set to the right value (large enough but not
unnecessarily large) to accommodate the maximum concurrent
users/connections. For those applications that use connection
pooling, we suggest setting MAXAPPLS to the connection pool size
plus one or two (just in case you need to invoke command line
connection to do something at the same time).
How to change the parameter
To change the value of MAXAPPLS, run the
following command:
db2 -v update db cfg for DB_NAME using MAXAPPLS a_number
db2 -v terminate
|
Investigative steps
When an application attempts to connect to a
database, but the value of MAXAPPLS has already been reached, the
following error is returned to the application indicating that the
maximum number of applications have been connected to the
database.
SQL1040N The maximum number of applications is already connected to
the database. SQLSTATE=57030
|
Number
of asynchronous page cleaners (NUM_IOCLEANERS)
Background
NUM_IOCLEANERS is a database configuration
parameter that lets you specify the number of asynchronous page
cleaners for a database. These page cleaners write changed pages
from the buffer pool to disk before the space in the buffer pool is
required by a database agent. This allows the agents to read new
pages without having to wait for changed pages to be written out.
As a result, your application's transactions should run faster.
If you set the parameter to zero (0), no page
cleaners are started and as a result, the database agents will
perform all of the page writes from the buffer pool to disk. This
parameter can have a significant performance impact on a database
stored across many physical storage devices, because in this case
there is a greater likelihood that one of the devices will be idle.
If no page cleaners are configured, your applications may encounter
periodic "log full" conditions.
If the applications for a database consist
primarily of transactions that update data, an increase in the
number of cleaners will speed up performance. Increasing the page
cleaners will also reduce recovery time from soft failures, such as
power outages, because the contents of the database on disk will be
more up-to-date at any given time.
Here are some factors to consider when setting
the value for this parameter:
- If transactions are run against the database, set this
parameter to be between one and the number of physical storage
devices used for the database. One recommendation is to set it at
least to the number of CPUs on your system.
- Environments with high update transaction rates may require
more page cleaners to be configured.
- Environments with large buffer pools may also require more page
cleaners to be configured.
How to change the parameter
The following command can be used to set this
parameter to a new value:
db2 -v update db cfg for DB_NAME using NUM_IOCLEANERS a_number
db2 -v terminate
|
Investigative steps
Use the database system monitor to help you
tune this configuration parameter using information from the
snapshot data (or event monitor) about write activity from a buffer
pool.
When using snapshot and collecting snapshot
data for the buffer pool, monitor the following counters: :
Buffer pool data writes = 0
Asynchronous pool data page writes = 0
Buffer pool index writes = 0
Asynchronous pool index page writes = 0
LSN Gap cleaner triggers = 0
Dirty page steal cleaner triggers = 0
Dirty page threshold cleaner triggers = 0
|
How do you decide whether NUM_IOCLEANERS should be reduced or
increased?
Decrease NUM_IOCLEANERS if both of the
following conditions are true:
- "Buffer pool data writes" is approximately equal to
"Asynchronous pool data page writes."
- "Buffer pool index writes" is approximately equal to
"Asynchronous pool index page writes."
Increase NUM_IOCLEANERS if either of the
following conditions is true:
- "Buffer pool data writes" is much greater than "Asynchronous
pool data page writes."
- "Buffer pool index writes" is much greater than "Asynchronous
pool index page writes."
Dirty page steal cleaner triggers tells the
number of times a page cleaner was invoked because a synchronous
write was needed during the victim buffer replacement for the
database. For a better response time, this number should be as low
as possible. With the counters shown above, you can use the
following formula to calculate what percentage of all cleaner
invocations are represented by this element:
Dirty page steal cleaner triggers
/
(Dirty page steal cleaner triggers +
Dirty page threshold cleaner triggers +
LSN Gap cleaner triggers)
|
If this ratio is high, it may indicate that you
have too few page cleaners defined. Too few page cleaners increases
recovery time after failures
Number of
I/O servers (NUM_IOSERVERS)
Background
I/O servers are used on behalf of the database
agents to perform prefetch I/O and asynchronous I/O by utilities
such as backup and restore. This parameter, a database
configuration parameter, specifies the number of I/O servers for a
database. No more than this number of I/Os for prefetching and
utilities can be in progress for a database at any time. An I/O
server waits while an I/O operation that it initiated is in
progress. Non-prefetch I/Os are scheduled directly from the
database agents and as a result are not constrained by
NUM_IOSERVERS.
Recommendations
In an OLTP environment, use the default.
How to change the parameter
Use the following command to set NUM_IOSERVERS
to a new value:
db2 -v update db cfg for DB_NAME using NUM_IOSERVERS a_number
db2 -v terminate
|
Number of
commits to group (MINCOMMIT)
Background
MINCOMMIT is database configuration parameter
that lets you delay the writing of log records to disk until a
minimum number of commits have been performed. This delay can help
reduce the database manager overhead associated with writing log
records. This can mean improved performance when you have multiple
applications running against a database and many commits are
requested by the applications within a very short time frame. This
grouping of commits will only occur when the value of this
parameter is greater than one and when the number of applications
connected to the database is greater than, or equal to, the value
of this parameter. When commit grouping is being performed,
application commit requests are held until either one second has
elapsed or the number of commit requests equals the value of this
parameter.
Recommendations
The default value for MINCOMMIT is 1. Increase
this parameter from its default value if multiple read/write
applications typically request concurrent database commits. This
will result in more efficient logging file I/O because it will
occur less frequently and write more log records each time it does
occur. If you believe the default value is not adequate, then it is
recommended that you start with 3, and move it up and down to see
the performance impact on your workload. You could also sample the
number of transactions per second and adjust this parameter to
accommodate the peak number of transactions per second (or some
large percentage of it). Accommodating peak activity minimizes the
overhead of writing log records during heavy load periods.
If you increase MINCOMMIT, you may also need to
increase the LOGBUFSZ parameter to avoid having a full log buffer
force a write during these heavy load periods. In this case, the
LOGBUFSZ should be equal to:
MINCOMMIT * (log space used, on average, by a transaction)
|
Here is how to use the database system monitor
to help you tune this parameter in the following ways:
- Calculating the peak number of transactions per second:
By taking monitor samples throughout a typical day, you can
determine your heavy load periods. One way to accomplish this is as
follows:
1. At the beginning of your measurements, issue
the following command:
db2 -v reset monitor for database db_name
|
(This will not reset the counters for high
water marks.)
2. At the end of your measurements, issue the
following command:
db2 -v get snapshot for database on db_name
|
3. Use the following output to calculate the
peak number of transactions:
Last reset timestamp = 06-12-2001 14:51:43.786876
Snapshot timestamp = 06-12-2001 14:56:27.787088
Commit statements attempted = 1011
Rollback statements attempted = 10
Log space used by the database (Bytes) = 3990
|
Let totalTransactions be the sum of
"commit statements attempted" and "rollback statements
attempted."
Let totalElapsedTime (in seconds) be the
difference between "Last reset timestamp" and "Snapshot timestamp".
Calculate the number of transactions per second as
NumOfTransPerSecond = totalTransactions / totalElapsedTime
|
- Calculating the log space used per transaction:
In a similar manner, by using sampling
techniques over a period of time and a number of transactions, you
can calculate an average of the log space used with the following
monitor element: log_space_used (unit of work log space
used).
1. Reset the monitor for the database of
interest at the beginning of the measurements using the
command:
db2 -v reset monitor for database db_name.
|
2. Take the snapshots at the end of the
measurements using the command:
db2 -v get snapshot for database on db2_name.
|
-
3. Output like that shown above is
produced.
4. Calculate the log space used per transaction
can be calculated using the following formula:
LogSpaceUsedPerTrans = log_space_used / totalTransactions
|
How to change the parameter
Use the following command to change the
MINCOMMIT value:
db2 -v update db cfg for DB_NAME using MINCOMMIT a_number
db2 -v terminate
|
Summary
This paper describes a number of DB2
performance fundamentals, tuning tips and techniques, and major DB2
configuration parameters that can affect OLTP performance. By
following some of the simple steps described here, you can set up,
monitor, and tune your DB2 database system.We hope that the
guidance offered in this paper will help you achieve the goal of
optimizing performance for your DB2 applications.
Resources
About the authors  | |  | Yongli An is a DB2 UDB Performance Engineer and IBM Certified Solutions Expert in DB2 UDB. Yongli is experienced in TPC-C benchmarking, and his current focus is DB2 performance for WebSphere Advanced Server and e-business applications. |
 | |  | Peter Shum is a DB2 e-business Performance Manager. Peter is responsible for DB2 performance for TPC-C, TPC-W, WebSphere Advanced Server, and CommerceSuite. Prior to doing performance work, he was in DB2 development responsible for the TP Monitor support and the Distribution Relational Database Architecture. |
Rate this page
|