Register | Log in


Subscribe Now>>
Home Tech2Tech Features Viewpoints Facts & Fun Teradata.com
Tech Support
Send to Colleague

Power of the written word

Teradata Certified Master responds to readers' technical inquiries with a personal touch.

by Carrie Ballinger, senior database analyst, Teradata Certified Master

Not having a television in our home until I was 10, I grew up loving to read. Even today, I have two or three books going at any point in time. I'd like to turn that old adage "A picture is worth a thousand words" on its head. To my way of thinking, "Several well-expressed words are worth thousands of pictures."

My appreciation of the written word is one reason I enjoy putting together this column of technical questions that I have received and responses I have given. But in case you don't share my affection for the written word, I've thrown in a few graphics to help illustrate my points.

Collecting statistics on dictionary tables
Dear Carrie: I've been regularly collecting statistics on several data dictionary tables for the last six months. During this time I've noticed a drop in our platform's monthly CPU growth rate when compared against last year's CPU consumption of the same month. I've also noticed about a 15% reduction in my canary query average response times. Since it only takes two minutes to do the collection, we've made dictionary statistics collection part of our normal routine.

We've gotten many homegrown applications that hit the data dictionary tables, often once per query. I've seen up to 5,000 queries against the data dictionary tables in Database Query Log (DBQL) on some days. Because some of these queries run a long time and require a large amount of spool, they can interfere with other queries. By collecting statistics, we've been able to reduce the resources that those queries require.

I haven't seen this approach recommended anywhere—are there any trade-offs I need to be aware of?
—Statistically Speaking

Dear Speaking: If you are seeing CPU reduction and response-time improvements from collecting statistics on data dictionary tables, then you should continue this process. Application packages or requests originating from business intelligence (BI) tools often include some behind-the-scenes dictionary accesses for validation or information checks.

All of the database modules that require information from the dictionary tables (such as the parser, optimizer or logon routines) make use of express requests. These requests are highly optimized, single-AMP access queries. Therefore, collecting statistics on data dictionary tables will not benefit any of the database's internal activities; rather, the benefit is targeted to SQL submitted directly by a user or application.

Any potential benefit from collecting statistics will depend on the amount and complexity of the dictionary access activity. Using DBQL to make such an assessment worked well for you and is a good starting point for other Teradata users contemplating similar actions. One obstacle that some sites have faced, however, is that the DBC logon is required to collect against dictionary tables.

Also remember that only the hashed dictionary tables allow statistics collections. This eliminates tables such as DBC.Acctg or DBC.DatabaseSpace from consideration. (See figure 1, above.)

Not all companies find value in collecting statistics on the dictionary tables. The value will depend on the frequency and complexity of the queries that access the dictionary tables, the amount of resource they use, and how long they run on your platform.

Before you decide whether to collect these new statistics, examine the DBQL output to understand if the resources used by queries accessing the dictionary warrants collecting statistics on the dictionary tables.

Primary index (PI) and non-unique secondary index (NUSI) columns are always good candidates for statistics collection. Based on what you see in DBQL, also consider collecting statistics on important selection columns. Keep in mind that the same guidelines for statistics collection apply to data dictionary tables as to any user database tables.

Below is a list of common data dictionary tables that are candidates for statistics collection. Some Teradata sites have found that collecting statistics on these tables provided value:

  • DBC.TVM
  • DBC.DBase
  • DBC.TVFields
  • DBC.AccessRights
  • DBC.Indexes
  • DBC.Profiles
  • DBC.Owners
  • DBC.Roles
  • DBC.RoleGrants
  • DBC.UDFInfo

The following non-hashed tables in the data dictionary do not support statistics collection:

  • DBC.Acctg. Resource usage by Account/User
  • DBC.ChangedRowJournal. Down-AMP recovery journal
  • DBC.DatabaseSpace. Database and table space accounting
  • DBC.LocalSessionStatusTable. Last request status by AMP
  • DBC.LocalTransactionStatusTable. Last transaction consensus status
  • DBC.OrdSysChngTable. AMP recovery journal
  • DBC.RecoveryLockTable. Recovery session locks
  • DBC.RecoveryPJTable. Permanent journal recovery
  • DBC.SavedTransactionStatus. AMP recovery table
  • DBC.SysRcvStatJournal. Recovery, reconfig, and startup information

MaxLoadTasks setting and Utility Throttles
Dear Carrie: We're on Teradata Database V2R6.2. To support a higher number of load jobs on our platform, I've set the MaxLoadTasks parameter at greater than the default of 15.

I know that enabling a Utility Throttle to control concurrency of either FastLoad or MultiLoad jobs will override the MaxLoadTasks setting. If a Utility Throttle is enabled only to control concurrent FastExport jobs, will that cause MaxLoadTasks to be ignored as well?
—Loading up

Dear Loading: In release V2R6.2, the presence of any type of Utility Throttle—including a throttle on FastExport jobs—will prevent your MaxLoadTasks parameter from being enforced. When combined with a MaxLoadAWT setting that is greater than zero, the MaxLoadTasks parameter allows you to establish a load job limit higher than the default of 15. For FastLoad and MultiLoad combined, this limit may be increased to as high as 30.

Utility Throttles, part of Teradata Dynamic Workload Manager, offer the same maximum of 30 for FastLoad and MultiLoad jobs combined, but they allow you to set more granular limits by type of utility, day of week and time. In addition, Utility Throttles can be set up for FastExport, with an upper limit of 60; however, the actual runtime limit is 60 minus the total of all active load jobs taken in combination. (See figure 2, right.)

The diverging approaches to managing load concurrencies—MaxLoadTasks versus Utility Throttles—become even more disparate in Teradata 12.0. Once the "throttle category" is enabled, Teradata Dynamic Workload Manager becomes the sole authority on the number of concurrent load jobs allowed, even if no Utility Throttles exist. This is intended to alleviate any potential conflict between the two approaches. It also reinforces that Teradata Dynamic Workload Manager is the preferred choice for managing load utility concurrency, because it provides greater flexibility and allows users to delay utilities once a limit is reached.

Differences between DBQL and Acctg CPU
Dear Carrie: When comparing CPU consumption time recorded in DBC.Acctg rows against the amount recorded in DBQL, over the course of a day I saw a net difference of 1% to 2%. The vast majority of the requests that had a difference showed more CPU time recorded in Acctg than in DBQL. Do you have any idea why these two tables would record different CPU times for the same requests?
—CPU Sophisticate

Dear Sophisticate: Both DBQL and Acctg gather CPU statistics but in slightly different places. DBQL bases its statistics off the data collected by the individual step code at the time the step is completed; Acctg collects the same resource usage numbers as DBQL, but at a slightly later point, after some internal structures have been updated with this information. This usually includes some small percentage of additional work that DBQL does not see.

Because the CPU statistics are collected at different points, it is not surprising that Acctg reports CPU levels that are a percentage point or two higher than levels reported by DBQL.

To test how CPU statistics appear in the Acctg and DBQL tables, I ran a few comparison tests. Overall, I found most of my queries showed little, if any, difference.

While small differences appeared in Acctg, as indicated above, I found one instance where DBQL reported more CPU usage than Acctg: the full table updates showed DBQL with nearly 5% higher CPU usage. This is because the Acctg table does not capture the CPU that is consumed during the end transaction processing (EDT) step. Under normal situations, very little or no CPU is consumed in this step. However, during long-running updates, some work must be done in the EDT step that access-only queries do not require—for example, transient journal management.

To better understand any differences between the Acctg table and DBQL CPU collections, I ran 17 different types of requests. These were primarily DSS queries with some set-level updates included. CPU usage from both Acctg and DBQL was captured.

The results are shown in table 1, right. The ratio column divides DBQL CPU by Acctg CPU: If the amounts are equal, the ratio will be 1; if DBQL reports more CPU usage than Acctg, the ratio will be greater than 1; if Acctg reports the greater amount of CPU, the ratio will be less than 1. The five decimal places illustrate how small most of the differences are between the two approaches to reporting CPU usage.

The two exception cases are queries 16 and 17. Both full table updates show approximately 5% more CPU usage in DBQL. Because the amount of CPU used in the end transaction processing is included in the DBQL calculation, this approach will be more accurate for these types of requests.

Trigger causes AMP lock table to overflow
Dear Carrie: We're having trouble understanding an error we're experiencing on the following INSERT/SELECT query with a trigger:

   INSERT INTO customer SELECT * FROM customer_istg;
The trigger is a simple insert of the primary key attribute into another table. This value will be used in a downstream process.

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW AS n
   FOR EACH ROW
   INSERT INTO Q_Customer
   VALUES ( n.CustomerKey );
The source and target tables have the same UPI—in fact, they are exact clones. When we test with 100,000 rows, everything works fine. When we test with 1 million rows, we get the 2971 error ["The AMP lock table has overflowed"], regardless if we use SQL Assistant or BTEQ. When we disable the trigger, the million rows insert without incident. Any suggestions?
—Trigger Unhappy

Dear Unhappy: Row triggers should be used for only a few rows; statement triggers are used for many rows. The 2971 error appears because each triggered row places a row hash lock, which causes the AMPs to run out of lock resources.

Many factors can contribute to lock table overflow, and, unfortunately, there is no clear maximum to the number of locks that can be set. One lab test showed that up to 26,000 row hash locks were applied on each AMP before receiving the 2971 error.

To avoid the lock overflow problem, you can lock the table Q_Customer for write using the following locking modifier on the INSERT/SELECT action in the trigger:

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW AS n
   FOR EACH ROW
   LOCK Q_CUSTOMER FOR WRITE
   INSERT INTO Q_Customer
   VALUES ( n.CustomerKey );

However, just because you don't receive that error doesn't mean you're executing the trigger optimally. In addition to locking issues, row triggers execute the trigger update steps over and over, once for each row. Statement triggers execute the update to the triggered table once, applying the set of triggered rows in one step. If you use a batch approach to updating, then it's better to also use a batch approach to your triggers with this code:

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW TABLE AS n
   FOR EACH STATEMENT
   INSERT INTO Q_Customer SELECT CustomerKey FROM n;

To help decide which trigger to use, compare EXPLAINs of the INSERT/SELECT query with a row trigger and a statement trigger to see which provides a better plan.

Below are the tables and subsequent EXPLAINs from INSERT statements when three slightly different triggers are defined on the same table. Teradata mode was used in all examples, and no data was in the tables. This first table is the target of the INSERT statement:

   CREATE TABLE customer
      (CustomerKey INT NOT NULL PRIMARY KEY,
      moredata char(100));

The trigger inserts into this second table:

   CREATE TABLE q_customer
      (Q_CustomerKey INT NOT NULL PRIMARY KEY);

This third staging table is the source of the rows that will be inserted:

   CREATE TABLE customer_istg AS customer WITH NO DATA;

Defining a row trigger similar to the one used by "Trigger Unhappy" produced this EXPLAIN:

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW AS n
   FOR EACH ROW
   INSERT INTO Q_Customer
   VALUES ( n.CustomerKey );
   
   EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
   
   Explanation
   1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
       to prevent global deadlock for PLS.customer_istg.
   2) Next, we lock a distinct PLS."pseudo table" for write on a RowHash
       to prevent global deadlock for PLS.customer.
   3) We lock PLS.customer_istg for read, and we lock PLS.customer
       for write.
   4) We do an all-AMPs JOIN step from PLS.customer_istg by way of an
       all-rows scan with no residual conditions, which is joined to PLS.
       customer by way of an all-rows scan with no residual conditions. 
       PLS.customer_istg and PLS.customer are joined using an exclusion
       merge join, with a join condition of ("(((PLS.customer.moredata IS
       NULL) OR (PLS.customer.moredata = PLS.customer_istg.moredata 
       )) AND ((PLS.customer_istg.moredata IS NULL) OR (PLS.customer.
       moredata = PLS.customer_istg.moredata ))) AND (PLS.customer.
       CustomerKey = PLS.customer_istg.CustomerKey)") where unknown 
       comparison will be ignored. The result goes into Spool 1 (all_amps), 
       which is built locally on the AMPs. The size of Spool 1 is estimated 
       with index join confidence to be 2 rows (250 bytes). The estimated 
       time for this step is 0.04 seconds.
   5) We do an all-AMPs RETRIEVE step from Spool 1 by way of an all-rows 
       scan into Spool 2 (all_amps), which is redistributed by the hash 
       code of (PLS.customer_istg.CustomerKey) to all AMPs. Then we 
       do a SORT to order Spool 2 by row hash. The size of Spool 2 is 
       estimated with index join confidence to be 2 rows (234 bytes). The 
       estimated time for this step is 0.02 seconds.
   6) We do an all-AMPs MERGE into PLS.customer from Spool 2 (Last 
       Use). The size is estimated with index join confidence to be 2 rows. 
       The estimated time for this step is 0.63 seconds.
   7) <BEGIN ROW TRIGGER LOOP> (< Processing one row at a time)
       We do an all-AMPs TRIGGER RETRIEVE step from Spool 1 (Last Use) 
       by way of an all-rows scan and send the qualifying rows back to
       the Dispatcher.
   8) We do an INSERT into PLS.Q_Customer. The estimated time for this
       step is 0.03 seconds. <END ROW TRIGGER LOOP> for step 7.
   9) Finally, we send out an END TRANSACTION step to all AMPs involved 
       in processing the request. -> No rows are returned to the user as the 
       result of statement 1.

This EXPLAIN also reflects a row trigger, similar to the one used above, but with a table-level write lock on the table that is triggered into:

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW AS n
   FOR EACH ROW
   LOCK Q_CUSTOMER FOR WRITE (< Table-level write lock)
   INSERT INTO Q_Customer
   VALUES ( n.CustomerKey );
   
   EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
   
   Explanation
   1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
       to prevent global deadlock for PLS.customer_istg.
   2) Next, we lock a distinct PLS."pseudo table" for write on a RowHash
       to prevent global deadlock for PLS.Q_CUSTOMER.
   3) We lock a distinct PLS."pseudo table" for write on a RowHash to
       prevent global deadlock for PLS.customer.
   4) We lock PLS.customer_istg for read, we lock PLS.Q_CUSTOMER for
       write, and we lock PLS.customer for write.
   5) We do an all-AMPs JOIN step from PLS.customer_istg by way
       of an all-rows scan with no residual conditions, which is joined to
       PLS.customer by way of an all-rows scan with no residual conditions.
       PLS.customer_istg and PLS.customer are joined using an
       exclusion merge join, with a join condition of ("(((PLS.customer.
       moredata IS NULL) OR (PLS.customer.moredata = PLS.customer_istg.
       moredata )) AND ((PLS.customer_istg.moredata IS NULL) OR 
       (PLS.customer.moredata = PLS.customer_istg.moredata ))) AND 
       (PLS.customer.CustomerKey = PLS.customer_istg.CustomerKey)")
       where unknown comparison will be ignored. The result goes into
       Spool 1 (all_amps), which is built locally on the AMPs. The size of
       Spool 1 is estimated with index join confidence to be 2 rows (250 
       bytes). The estimated time for this step is 0.04 seconds.
   6) We do an all-AMPs RETRIEVE step from Spool 1 by way of an 
       all-rows scan into Spool 2 (all_amps), which is redistributed by 
       the hash code of (PLS.customer_istg.CustomerKey) to all AMPs. 
       Then we do a SORT to order Spool 2 by row hash. The size of 
       Spool 2 is estimated with index join confidence to be 2 rows (234 
       bytes). The estimated time for this step is 0.02 seconds.
   7) We do an all-AMPs MERGE into PLS.customer from Spool 2 (Last 
       Use). The size is estimated with index join confidence to be 2 
       rows. The estimated time for this step is 0.63 seconds.
   8) <BEGIN ROW TRIGGER LOOP> (< Eliminated hash locking problem, 
       but still processing one row at a time)
       We do an all-AMPs TRIGGER RETRIEVE step from Spool 1 (Last 
       Use) by way of an all-rows scan and send the qualifying rows 
       back to the Dispatcher.
   9) We do an INSERT into PLS.Q_Customer. The estimated time for 
       this step is 0.03 seconds.
       <END ROW TRIGGER LOOP> for step 8.
  10) Finally, we send out an END TRANSACTION step to all AMPs 
      involved in processing the request.
      -> No rows are returned to the user as the result of statement 1.

This EXPLAIN is with a trigger that is a statement trigger:

   REPLACE TRIGGER cu_ins_trgr
   AFTER INSERT
   ON customer
   REFERENCING NEW TABLE AS n
   FOR EACH STATEMENT
   INSERT INTO Q_Customer SELECT CustomerKey FROM n;

   EXPLAIN INSERT INTO customer SELECT * FROM customer_istg;
   
   Explanation
   1) First, we lock a distinct PLS."pseudo table" for read on a 
       RowHash to prevent global deadlock for PLS.customer_istg.
   2) Next, we lock a distinct PLS."pseudo table" for write on a
       RowHash to prevent global deadlock for PLS.Q_Customer.
   3) We lock a distinct PLS."pseudo table" for write on a RowHash to
       prevent global deadlock for PLS.customer.
   4) We lock PLS.customer_istg for read, we lock PLS.Q_Customer for
       write, and we lock PLS.customer for write.
   5) We do an all-AMPs JOIN step from PLS.customer_istg by way
       of an all-rows scan with no residual conditions, which is joined to 
       PLS.customer by way of an all-rows scan with no residual conditions.
       PLS.customer_istg and PLS.customer are joined using an 
       exclusion merge join, with a join condition of ("(((PLS.customer.
       moredata IS NULL) OR (PLS.customer.moredata = PLS.customer_
       istg.moredata )) AND ((PLS.customer_istg.moredata IS NULL) OR 
       (PLS.customer.moredata = PLS.customer_istg.moredata ))) AND 
       (PLS.customer.CustomerKey = PLS.customer_istg.CustomerKey)") 
       where unknown comparison will be ignored. The result goes into 
       Spool 1 (all_amps), which is built locally on the AMPs. The size of 
       Spool 1 is estimated with index join confidence to be 2 rows (250 
       bytes). The estimated time for this step is 0.04 seconds.
   6) We do an all-AMPs RETRIEVE step from Spool 1 by way of an 
       all-rows scan into Spool 2 (all_amps), which is redistributed by 
       the hash code of (PLS.customer_istg.CustomerKey) to all AMPs. 
       Then we do a SORT to order Spool 2 by row hash. The size of 
       Spool 2 is estimated with index join confidence to be 2 rows (234 
       bytes). The estimated time for this step is 0.02 seconds.
   7) We execute the following steps in parallel.
       1) We do an all-AMPs MERGE into PLS.customer from Spool 2 
           (Last Use). The size is estimated with index join confidence to 
           be 2 rows. The estimated time for this step is 0.63 seconds.
       2) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use)
           by way of an all-rows scan into Spool 3 (all_amps), which is 
           redistributed by the hash code of (PLS.customer_istg.CustomerKey) to 
           all AMPs. Then we do a SORT to order Spool 3 by row hash. The 
           size of Spool 3 is estimated with index join confidence to be 2 
           rows (34 bytes). The estimated time for this step is 0.02 seconds.
   8) We do an all-AMPs MERGE into PLS.Q_Customer from Spool 3 
       (Last Use). The size is estimated with index join confidence to be 
       2 rows. The estimated time for this step is 0.72 seconds.
   9) Finally, we send out an END TRANSACTION step to all AMPs 
       involved in processing the request.
   -> No rows are returned to the user as the result of statement 1. T

Reviewing MaxLoadTask and MaxLoadAWT parameters

As of Teradata Database V2R6.1, the MaxLoadTasks field works differently based on the value of the internal field MaxLoadAWT.
> If the internal field MaxLoadAWT is ZERO (by default), the MaxLoadTasks field:
  • Has a valid range of values from 0 to 15 tasks
  • Is the concurrent job limit for FastLoad, MultiLoad and FastExport
> If the internal field MaxLoadAWT is NOT ZERO, the MaxLoadTasks field:
  • Has a valid range of values from 0 to 30 tasks
  • Is the concurrent job limit for only FastLoad and MultiLoad
  • Does not control FastExport jobs, which are always allowed to run unless the combined number of active FastLoad, MultiLoad, and FastExport jobs is 60

Teradata Magazine-September 2008

Related Links

Reference Library

Get complete access to Teradata articles and white papers specific to your area of interest by selecting a category below. Reference Library
Search our library:


Manthan

Trillium

Protegrity

Teradata.com | About Us | Contact Us | Media Kit | Subscribe | Privacy/Legal | RSS
Copyright © 2008 Teradata Corporation. All rights reserved.