Teradata Certified Master responds to readers' technical inquiries with a personal touch.
by Carrie Ballinger, senior database analyst, Teradata Certified Master
Not having a television in our home until I was 10, I grew up loving to read. Even today, I have two or three books going at any point in
time. I'd like to turn that old adage "A picture is worth a thousand words" on its head. To my way of thinking, "Several well-expressed words
are worth thousands of pictures."
My appreciation of the written word is one reason I enjoy putting together this column of technical questions that I have received and
responses I have given. But in case you don't share my affection for the written word, I've thrown in a few graphics to help illustrate my
points.
Collecting statistics on dictionary tables
Dear Carrie: I've been regularly collecting statistics on several data dictionary tables for the last six
months. During this time I've noticed a drop in our platform's monthly CPU growth rate when compared against last year's CPU consumption of
the same month. I've also noticed about a 15% reduction in my canary query average response times. Since it only takes two minutes to do the
collection, we've made dictionary statistics collection part of our normal routine.
We've gotten many homegrown applications that hit the data dictionary tables, often once per query. I've seen up to 5,000 queries against the
data dictionary tables in Database Query Log (DBQL) on some days. Because some of these queries run a long time and require a large amount of
spool, they can interfere with other queries. By collecting statistics, we've been able to reduce the resources that those queries require.
I haven't seen this approach recommended anywhere—are there any trade-offs I need to be aware of?
—Statistically Speaking
Dear Speaking: If you are seeing CPU reduction and response-time improvements from collecting statistics on
data dictionary tables, then you should continue this process. Application packages or requests originating from business intelligence (BI)
tools often include some behind-the-scenes dictionary accesses for validation or information checks.
All of the database modules that require information from the dictionary tables (such as the parser, optimizer or logon routines) make use of
express requests. These requests are highly optimized, single-AMP access queries. Therefore, collecting statistics on data dictionary tables
will not benefit any of the database's internal activities; rather, the benefit is targeted to SQL submitted directly by a user or application.
Any potential benefit from collecting statistics will depend on the amount and complexity of the dictionary access activity. Using DBQL to
make such an assessment worked well for you and is a good starting point for other Teradata users contemplating similar actions. One obstacle
that some sites have faced, however, is that the DBC logon is required to collect against dictionary tables.
Also remember that only the hashed dictionary tables allow statistics collections. This eliminates tables such as DBC.Acctg or
DBC.DatabaseSpace from consideration. (See figure 1, above.)
Not all companies find value in collecting statistics on the dictionary tables. The value will depend on the frequency and complexity of the
queries that access the dictionary tables, the amount of resource they use, and how long they run on your platform.
Before you decide whether to collect these new statistics, examine the DBQL output to understand if the resources used by queries accessing
the dictionary warrants collecting statistics on the dictionary tables.
Primary index (PI) and non-unique secondary index (NUSI) columns are always good candidates for statistics collection. Based on what you see
in DBQL, also consider collecting statistics on important selection columns. Keep in mind that the same guidelines for statistics collection
apply to data dictionary tables as to any user database tables.
Below is a list of common data dictionary tables that are candidates for statistics collection. Some Teradata sites have found that collecting
statistics on these tables provided value:
- DBC.TVM
- DBC.DBase
- DBC.TVFields
- DBC.AccessRights
- DBC.Indexes
- DBC.Profiles
- DBC.Owners
- DBC.Roles
- DBC.RoleGrants
- DBC.UDFInfo
The following non-hashed tables in the data dictionary do not support statistics collection:
- DBC.Acctg. Resource usage by Account/User
- DBC.ChangedRowJournal. Down-AMP recovery journal
- DBC.DatabaseSpace. Database and table space accounting
- DBC.LocalSessionStatusTable. Last request status by AMP
- DBC.LocalTransactionStatusTable. Last transaction consensus status
- DBC.OrdSysChngTable. AMP recovery journal
- DBC.RecoveryLockTable. Recovery session locks
- DBC.RecoveryPJTable. Permanent journal recovery
- DBC.SavedTransactionStatus. AMP recovery table
- DBC.SysRcvStatJournal. Recovery, reconfig, and startup information
MaxLoadTasks setting and Utility Throttles
Dear Carrie: We're on Teradata Database V2R6.2. To support a higher number of load jobs on our platform,
I've set the MaxLoadTasks parameter at greater than the default of 15.
I know that enabling a Utility Throttle to control concurrency of either FastLoad or MultiLoad jobs will override the MaxLoadTasks setting.
If a Utility Throttle is enabled only to control concurrent FastExport jobs, will that cause MaxLoadTasks to be ignored as well?
—Loading up
Dear Loading: In release V2R6.2, the presence of any type of Utility Throttle—including a throttle on
FastExport jobs—will prevent your MaxLoadTasks parameter from being enforced. When combined with a MaxLoadAWT setting that is greater than
zero, the MaxLoadTasks parameter allows you to establish a load job limit higher than the default of 15. For FastLoad and MultiLoad combined,
this limit may be increased to as high as 30.
Utility Throttles, part of Teradata Dynamic Workload Manager, offer the same maximum of 30 for FastLoad and MultiLoad jobs combined, but they
allow you to set more granular limits by type of utility, day of week and time. In addition, Utility Throttles can be set up for FastExport,
with an upper limit of 60; however, the actual runtime limit is 60 minus the total of all active load jobs taken in combination.
(See figure 2, right.)
The diverging approaches to managing load concurrencies—MaxLoadTasks versus Utility Throttles—become even more disparate in Teradata 12.0.
Once the "throttle category" is enabled, Teradata Dynamic Workload Manager becomes the sole authority on the number of concurrent load jobs
allowed, even if no Utility Throttles exist. This is intended to alleviate any potential conflict between the two approaches. It also
reinforces that Teradata Dynamic Workload Manager is the preferred choice for managing load utility concurrency, because it provides greater
flexibility and allows users to delay utilities once a limit is reached.
Differences between DBQL and Acctg CPU
Dear Carrie: When comparing CPU consumption time recorded in DBC.Acctg rows against the amount recorded in
DBQL, over the course of a day I saw a net difference of 1% to 2%. The vast majority of the requests that had a difference showed more CPU
time recorded in Acctg than in DBQL. Do you have any idea why these two tables would record different CPU times for the same requests?
—CPU Sophisticate
Dear Sophisticate: Both DBQL and Acctg gather CPU statistics but in slightly different places. DBQL bases
its statistics off the data collected by the individual step code at the time the step is completed; Acctg collects the same resource usage
numbers as DBQL, but at a slightly later point, after some internal structures have been updated with this information. This usually includes
some small percentage of additional work that DBQL does not see.
Because the CPU statistics are collected at different points, it is not surprising that Acctg reports CPU levels that are a percentage point
or two higher than levels reported by DBQL.
To test how CPU statistics appear in the Acctg and DBQL tables, I ran a few comparison tests. Overall, I found most of my queries showed
little, if any, difference.
While small differences appeared in Acctg, as indicated above, I found one instance where DBQL reported more CPU usage than Acctg: the full
table updates showed DBQL with nearly 5% higher CPU usage. This is because the Acctg table does not capture the CPU that is consumed during
the end transaction processing (EDT) step. Under normal situations, very little or no CPU is consumed in this step. However, during
long-running updates, some work must be done in the EDT step that access-only queries do not require—for example, transient journal
management.
To better understand any differences between the Acctg table and DBQL CPU collections, I ran 17 different types of requests. These were
primarily DSS queries with some set-level updates included. CPU usage from both Acctg and DBQL was captured.
The results are shown in table 1, right. The ratio column divides DBQL CPU by Acctg CPU: If the amounts are equal, the ratio will be 1; if
DBQL reports more CPU usage than Acctg, the ratio will be greater than 1; if Acctg reports the greater amount of CPU, the ratio will be less
than 1. The five decimal places illustrate how small most of the differences are between the two approaches to reporting CPU usage.
The two exception cases are queries 16 and 17. Both full table updates show approximately 5% more CPU usage in DBQL. Because the amount of
CPU used in the end transaction processing is included in the DBQL calculation, this approach will be more accurate for these types of
requests.
Trigger causes AMP lock table to overflow
Dear Carrie: We're having trouble understanding an error we're experiencing on the following INSERT/SELECT
query with a trigger:
INSERT INTO customer SELECT * FROM customer_istg;
The trigger is a simple insert of the primary key attribute into another table. This value will be used in a downstream process.
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW AS n
FOR EACH ROW
INSERT INTO Q_Customer
VALUES ( n.CustomerKey );
The source and target tables have the same UPI—in fact, they are exact clones. When we test with 100,000 rows, everything works fine. When we
test with 1 million rows, we get the 2971 error ["The AMP lock table has overflowed"], regardless if we use SQL Assistant or BTEQ. When we
disable the trigger, the million rows insert without incident. Any suggestions?
—Trigger Unhappy
Dear Unhappy: Row triggers should be used for only a few rows; statement triggers are used for many rows.
The 2971 error appears because each triggered row places a row hash lock, which causes the AMPs to run out of lock resources.
Many factors can contribute to lock table overflow, and, unfortunately, there is no clear maximum to the number of locks that can be set. One
lab test showed that up to 26,000 row hash locks were applied on each AMP before receiving the 2971 error.
To avoid the lock overflow problem, you can lock the table Q_Customer for write using the following locking modifier on the INSERT/SELECT
action in the trigger:
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW AS n
FOR EACH ROW
LOCK Q_CUSTOMER FOR WRITE
INSERT INTO Q_Customer
VALUES ( n.CustomerKey );
However, just because you don't receive that error doesn't mean you're executing the trigger optimally. In addition to locking issues, row
triggers execute the trigger update steps over and over, once for each row. Statement triggers execute the update to the triggered table once,
applying the set of triggered rows in one step. If you use a batch approach to updating, then it's better to also use a batch approach to your
triggers with this code:
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW TABLE AS n
FOR EACH STATEMENT
INSERT INTO Q_Customer SELECT CustomerKey FROM n;
To help decide which trigger to use, compare EXPLAINs of the INSERT/SELECT query with a row trigger and a statement trigger to see which
provides a better plan.
Below are the tables and subsequent EXPLAINs from INSERT statements when three slightly different triggers are defined on the same table.
Teradata mode was used in all examples, and no data was in the tables. This first table is the target of the INSERT statement:
CREATE TABLE customer
(CustomerKey INT NOT NULL PRIMARY KEY,
moredata char(100));
The trigger inserts into this second table:
CREATE TABLE q_customer
(Q_CustomerKey INT NOT NULL PRIMARY KEY);
This third staging table is the source of the rows that will be inserted:
CREATE TABLE customer_istg AS customer WITH NO DATA;
Defining a row trigger similar to the one used by "Trigger Unhappy"
produced this EXPLAIN:
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW AS n
FOR EACH ROW
INSERT INTO Q_Customer
VALUES ( n.CustomerKey );
EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
Explanation
1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
to prevent global deadlock for PLS.customer_istg.
2) Next, we lock a distinct PLS."pseudo table" for write on a RowHash
to prevent global deadlock for PLS.customer.
3) We lock PLS.customer_istg for read, and we lock PLS.customer
for write.
4) We do an all-AMPs JOIN step from PLS.customer_istg by way of an
all-rows scan with no residual conditions, which is joined to PLS.
customer by way of an all-rows scan with no residual conditions.
PLS.customer_istg and PLS.customer are joined using an exclusion
merge join, with a join condition of ("(((PLS.customer.moredata IS
NULL) OR (PLS.customer.moredata = PLS.customer_istg.moredata
)) AND ((PLS.customer_istg.moredata IS NULL) OR (PLS.customer.
moredata = PLS.customer_istg.moredata ))) AND (PLS.customer.
CustomerKey = PLS.customer_istg.CustomerKey)") where unknown
comparison will be ignored. The result goes into Spool 1 (all_amps),
which is built locally on the AMPs. The size of Spool 1 is estimated
with index join confidence to be 2 rows (250 bytes). The estimated
time for this step is 0.04 seconds.
5) We do an all-AMPs RETRIEVE step from Spool 1 by way of an all-rows
scan into Spool 2 (all_amps), which is redistributed by the hash
code of (PLS.customer_istg.CustomerKey) to all AMPs. Then we
do a SORT to order Spool 2 by row hash. The size of Spool 2 is
estimated with index join confidence to be 2 rows (234 bytes). The
estimated time for this step is 0.02 seconds.
6) We do an all-AMPs MERGE into PLS.customer from Spool 2 (Last
Use). The size is estimated with index join confidence to be 2 rows.
The estimated time for this step is 0.63 seconds.
7) <BEGIN ROW TRIGGER LOOP> (< Processing one row at a time)
We do an all-AMPs TRIGGER RETRIEVE step from Spool 1 (Last Use)
by way of an all-rows scan and send the qualifying rows back to
the Dispatcher.
8) We do an INSERT into PLS.Q_Customer. The estimated time for this
step is 0.03 seconds. <END ROW TRIGGER LOOP> for step 7.
9) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request. -> No rows are returned to the user as the
result of statement 1.
This EXPLAIN also reflects a row trigger, similar to the one used above, but with a table-level write lock on the table that is triggered
into:
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW AS n
FOR EACH ROW
LOCK Q_CUSTOMER FOR WRITE (< Table-level write lock)
INSERT INTO Q_Customer
VALUES ( n.CustomerKey );
EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
Explanation
1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
to prevent global deadlock for PLS.customer_istg.
2) Next, we lock a distinct PLS."pseudo table" for write on a RowHash
to prevent global deadlock for PLS.Q_CUSTOMER.
3) We lock a distinct PLS."pseudo table" for write on a RowHash to
prevent global deadlock for PLS.customer.
4) We lock PLS.customer_istg for read, we lock PLS.Q_CUSTOMER for
write, and we lock PLS.customer for write.
5) We do an all-AMPs JOIN step from PLS.customer_istg by way
of an all-rows scan with no residual conditions, which is joined to
PLS.customer by way of an all-rows scan with no residual conditions.
PLS.customer_istg and PLS.customer are joined using an
exclusion merge join, with a join condition of ("(((PLS.customer.
moredata IS NULL) OR (PLS.customer.moredata = PLS.customer_istg.
moredata )) AND ((PLS.customer_istg.moredata IS NULL) OR
(PLS.customer.moredata = PLS.customer_istg.moredata ))) AND
(PLS.customer.CustomerKey = PLS.customer_istg.CustomerKey)")
where unknown comparison will be ignored. The result goes into
Spool 1 (all_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with index join confidence to be 2 rows (250
bytes). The estimated time for this step is 0.04 seconds.
6) We do an all-AMPs RETRIEVE step from Spool 1 by way of an
all-rows scan into Spool 2 (all_amps), which is redistributed by
the hash code of (PLS.customer_istg.CustomerKey) to all AMPs.
Then we do a SORT to order Spool 2 by row hash. The size of
Spool 2 is estimated with index join confidence to be 2 rows (234
bytes). The estimated time for this step is 0.02 seconds.
7) We do an all-AMPs MERGE into PLS.customer from Spool 2 (Last
Use). The size is estimated with index join confidence to be 2
rows. The estimated time for this step is 0.63 seconds.
8) <BEGIN ROW TRIGGER LOOP> (< Eliminated hash locking problem,
but still processing one row at a time)
We do an all-AMPs TRIGGER RETRIEVE step from Spool 1 (Last
Use) by way of an all-rows scan and send the qualifying rows
back to the Dispatcher.
9) We do an INSERT into PLS.Q_Customer. The estimated time for
this step is 0.03 seconds.
<END ROW TRIGGER LOOP> for step 8.
10) Finally, we send out an END TRANSACTION step to all AMPs
involved in processing the request.
-> No rows are returned to the user as the result of statement 1.
This EXPLAIN is with a trigger that is a statement trigger:
REPLACE TRIGGER cu_ins_trgr
AFTER INSERT
ON customer
REFERENCING NEW TABLE AS n
FOR EACH STATEMENT
INSERT INTO Q_Customer SELECT CustomerKey FROM n;
EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
Explanation
1) First, we lock a distinct PLS."pseudo table" for read on a
RowHash to prevent global deadlock for PLS.customer_istg.
2) Next, we lock a distinct PLS."pseudo table" for write on a
RowHash to prevent global deadlock for PLS.Q_Customer.
3) We lock a distinct PLS."pseudo table" for write on a RowHash to
prevent global deadlock for PLS.customer.
4) We lock PLS.customer_istg for read, we lock PLS.Q_Customer for
write, and we lock PLS.customer for write.
5) We do an all-AMPs JOIN step from PLS.customer_istg by way
of an all-rows scan with no residual conditions, which is joined to
PLS.customer by way of an all-rows scan with no residual conditions.
PLS.customer_istg and PLS.customer are joined using an
exclusion merge join, with a join condition of ("(((PLS.customer.
moredata IS NULL) OR (PLS.customer.moredata = PLS.customer_
istg.moredata )) AND ((PLS.customer_istg.moredata IS NULL) OR
(PLS.customer.moredata = PLS.customer_istg.moredata ))) AND
(PLS.customer.CustomerKey = PLS.customer_istg.CustomerKey)")
where unknown comparison will be ignored. The result goes into
Spool 1 (all_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with index join confidence to be 2 rows (250
bytes). The estimated time for this step is 0.04 seconds.
6) We do an all-AMPs RETRIEVE step from Spool 1 by way of an
all-rows scan into Spool 2 (all_amps), which is redistributed by
the hash code of (PLS.customer_istg.CustomerKey) to all AMPs.
Then we do a SORT to order Spool 2 by row hash. The size of
Spool 2 is estimated with index join confidence to be 2 rows (234
bytes). The estimated time for this step is 0.02 seconds.
7) We execute the following steps in parallel.
1) We do an all-AMPs MERGE into PLS.customer from Spool 2
(Last Use). The size is estimated with index join confidence to
be 2 rows. The estimated time for this step is 0.63 seconds.
2) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use)
by way of an all-rows scan into Spool 3 (all_amps), which is
redistributed by the hash code of (PLS.customer_istg.CustomerKey) to
all AMPs. Then we do a SORT to order Spool 3 by row hash. The
size of Spool 3 is estimated with index join confidence to be 2
rows (34 bytes). The estimated time for this step is 0.02 seconds.
8) We do an all-AMPs MERGE into PLS.Q_Customer from Spool 3
(Last Use). The size is estimated with index join confidence to be
2 rows. The estimated time for this step is 0.72 seconds.
9) Finally, we send out an END TRANSACTION step to all AMPs
involved in processing the request.
-> No rows are returned to the user as the result of statement 1. T
| Reviewing MaxLoadTask and MaxLoadAWT parameters |
|
As of Teradata Database V2R6.1, the MaxLoadTasks field works differently based on the value of the internal field MaxLoadAWT.
| > |
If the internal field MaxLoadAWT is ZERO (by default), the MaxLoadTasks field:
- Has a valid range of values from 0 to 15 tasks
- Is the concurrent job limit for FastLoad, MultiLoad and FastExport
|
| > |
If the internal field MaxLoadAWT is NOT ZERO, the MaxLoadTasks field:
- Has a valid range of values from 0 to 30 tasks
- Is the concurrent job limit for only FastLoad and MultiLoad
- Does not control FastExport jobs, which are always allowed to run unless the combined number of active FastLoad, MultiLoad, and FastExport jobs is 60
|
|
|
Teradata Magazine-September 2008
|