Managing AMP worker tasks
Getting the most out of your active data warehouse means understanding how to best provision AMP worker tasks.
by Carrie Ballinger
AMP worker tasks (AWTs) represent the worker bees within each Teradata AMP, and like all limited resources, you can use them all up. Usually this is not something you need to worry about. For a busy system supporting traditional decision-support workloads, a brief period of AWT exhaustion is as normal as earthquakes in Southern California and is not likely to even register on the Richter scale.
If your site is supporting active data warehousing, however, you need to start paying a little more attention to AWT usage levels. As you’ve probably already discovered, continuous loads and tactical queries are more easily unbalanced by slight tremors from any source. Because I’ve had an increasing number of questions recently on this topic, I’ve decided to dedicate this column to describing AWTs and sharing some of the questions I’ve received from you about them.
The care and feeding of AWTs
Dear Carrie: What exactly are AWTs?
—Ready to Learn
Dear Ready: AMP worker tasks (AWTs) are the processes inside each AMP that get the database work done. A specific number of these pre-allocated AWTs are assigned to each AMP at startup and, like taxi cabs at the airport, they wait for work to arrive, do the work and come back for more (see figure 1).
Because of their stateless condition, AWTs respond quickly to a variety of database execution needs, including internal database software tasks such as deadlock detection, or work originating from a user-submitted query, like scanning a table. And because there’s a fixed number running, AWTs serve to limit the number of active processes performing database work within each AMP at any point in time.
When a message that contains a query step is sent to an AMP, that step draws from the pool of available AWTs. All of the information and context needed to perform the database work, such as the ID of a spool file to be read, is contained within that query step. Once the step has been completed, the AWT is returned to the pool.
If all AWTs on that AMP are busy at the time the message containing the new step arrives, the message will wait in a queue until an AWT is free. When that queue reaches a specific length, no more messages will be accepted on that AMP until the queue gets worked down. When an AMP catches its breath by temporarily turning away new messages, we say the AMP is in a state of flow control. Such turned-away messages will be automatically resent by the parsing engine until they are able to be received.
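The pool, queue and flow-control behavior just described can be pictured with a small toy model. This is only a sketch under stated assumptions, not how the database is implemented: the class name `ToyAmp`, its methods and the `max_queue` value are all hypothetical, and the real queue limit is not given here.

```python
from collections import deque


class ToyAmp:
    """Toy model of one AMP: a fixed AWT pool plus a bounded message queue."""

    def __init__(self, awt_count=80, max_queue=20):
        # awt_count=80 is the default discussed in this column; max_queue is
        # a made-up number, because the real queue limit is not stated.
        self.free_awts = awt_count
        self.max_queue = max_queue
        self.queue = deque()

    def in_flow_control(self):
        # Simplification: the AMP turns away messages while its queue is full.
        return len(self.queue) >= self.max_queue

    def receive(self, step):
        """Return 'running', 'queued' or 'rejected' for an arriving step."""
        if self.free_awts > 0:
            self.free_awts -= 1      # the step draws an AWT from the pool
            return "running"
        if self.in_flow_control():
            return "rejected"        # the sender is expected to resend later
        self.queue.append(step)
        return "queued"

    def complete(self):
        """A running step finishes; hand its AWT to queued work if any."""
        if self.queue:
            return self.queue.popleft()   # AWT goes straight to the next step
        self.free_awts += 1               # otherwise it returns to the pool
        return None
```

With a deliberately tiny configuration, `ToyAmp(awt_count=2, max_queue=1)`, four arriving steps come back as running, running, queued and rejected. Once one running step completes, the queued step takes over its AWT and the AMP accepts messages again.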
Figure 1: AWTs are assigned to each query step to perform database work and are released when that step is completed.
Figure 2: The system reserves 24 out of 80 AWTs by default, leaving 56 to service any type of work.
Figure 3: Specifying a reserve of two AWTs for expedited messages causes three new reserve pools to be created, and reduces the pool of unassigned AWTs by six.
When work messages arrive at the AMP, the system classifies them according to the importance of the work they do. The eight default work types are:
WorkNew: A step coming from the dispatcher
WorkOne: First level of spawned work (e.g., receiver tasks during row redistribution)
WorkTwo: Second level of spawned work
WorkThree: Third level of spawned work
WorkAbort: Urgent internal requests
WorkSpawn: Urgent internal requests
WorkNormal: Urgent internal requests
WorkControl: Urgent internal requests
Of this list, WorkNew and WorkOne support almost all user-assigned work.
AWTs are neutral; they can perform work from any of the eight work types. Because of that, two techniques have been put in place to ensure that one or two work types don’t use all available AWTs.
First, at any point in time, no more than 50 AWTs in the AMP can be busy processing WorkNew messages. Having this limit in place ensures that some reasonable number of AWTs will always be available for WorkOne tasks (see figure 2).
Second, small reserved pools of AWTs have been instituted for each work type. These reserve pools are logical, not physical. No AWTs are set aside specifically for WorkAbort, for example; rather, internal counters keep track of the number of AWTs in use at any point in time. The AWT resource manager makes sure that the number of unassigned AWTs never falls below the number that could support all reserves for all work types.
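One way to picture these two safeguards is as a counter check made before an AWT is handed out. The sketch below is an assumed model, not the actual resource manager: the function names and the exact admission rule are hypothetical, while the numbers (80 AWTs, a reserve of three per work type, a WorkNew limit of 50) come from this column.

```python
WORK_TYPES = ["WorkNew", "WorkOne", "WorkTwo", "WorkThree",
              "WorkAbort", "WorkSpawn", "WorkNormal", "WorkControl"]

TOTAL_AWTS = 80        # default AWTs per AMP
RESERVE_PER_TYPE = 3   # 24 reserved AWTs spread over 8 work types
WORKNEW_LIMIT = 50     # cap on AWTs busy with WorkNew messages


def unreserved_in_use(in_use):
    """Count AWTs each work type holds beyond its own logical reserve."""
    return sum(max(0, in_use.get(t, 0) - RESERVE_PER_TYPE) for t in WORK_TYPES)


def can_assign(in_use, work_type):
    """Assumed admission rule: may one more AWT go to work_type?"""
    current = in_use.get(work_type, 0)
    if work_type == "WorkNew" and current >= WORKNEW_LIMIT:
        return False          # first safeguard: the WorkNew cap
    if current < RESERVE_PER_TYPE:
        return True           # still inside this type's own reserve
    unreserved_pool = TOTAL_AWTS - RESERVE_PER_TYPE * len(WORK_TYPES)  # 56
    # Second safeguard: never dip into reserves that belong to other types.
    return unreserved_in_use(in_use) < unreserved_pool
```

Under this model, with 50 AWTs on WorkNew and 12 on WorkOne, the 56 unreserved AWTs are all taken, so `can_assign` turns down another WorkOne message but still admits a WorkAbort message, which falls within its own reserve.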
A Teradata Priority Scheduler option allows a new set of reserved AWTs to be defined, adding three new work types to the original eight. These new reserve pools are intended to service expedited work, usually tactical queries or TPump jobs that have experienced response-time degradation due to AWT scarcity. Be aware that the optional reserve pools reduce the number of AWTs in the unreserved AWT pool, leaving fewer for non-expedited work (see figure 3).
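The effect of the optional reserves on the unreserved pool is simple arithmetic, sketched below. The function and parameter names are hypothetical; the default values are the ones quoted in this column and its figure captions.

```python
def unassigned_awts(total=80, default_types=8, default_reserve=3,
                    expedited_types=3, expedited_reserve=0):
    """AWTs left in the unreserved pool after all reserves are set aside."""
    reserved = (default_types * default_reserve
                + expedited_types * expedited_reserve)
    return total - reserved


print(unassigned_awts())                     # default settings
print(unassigned_awts(expedited_reserve=2))  # reserve of two for expedited work
```

With the defaults, 24 AWTs are reserved and 56 remain. A reserve of two for each of the three expedited work types takes six more, leaving 50 for non-expedited work.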
Understanding your limits
Dear Carrie: Eighty AWTs per AMP seems awfully low to me. Why not a higher number?
—Always Reaching
Dear Reaching: Eighty AWTs have been shown to represent a good point of balance between productive usage of the platform and loss of throughput caused by unexpectedly high demand and contention for CPU and I/O resources. The theory behind setting a limit on AWTs is twofold. First, it’s better for overall throughput to put the brakes on before exhausting all resources. Second, keeping all AMPs to a reasonable usage level increases parallel efficiency. Historically, 80 AWTs combined with a limit of 50 on WorkNew has proven the sweet spot for the vast majority of Teradata sites.
If you’re experiencing an AWT shortage, first look at your CPU and I/O utilization numbers. If both are fully utilized, it is not advisable to increase AWTs. Most Teradata database systems will reach 100% CPU utilization with significantly fewer than 50 active processes of the WorkNew type. By the time most systems are approaching the limit of 80 AWTs, they are already at maximum levels of CPU usage. If hardware resources are exhausted, then focus on application or database optimization, on reducing the work running on the system, or on hardware expansion.
Monitoring AWTs
Dear Carrie: How do I know if I’m running out of AWTs?
—Planning Ahead
Dear Planning: There’s a new parallel database extension (PDE) tool, called AWTMON, available in Teradata Database V2R6.1 that displays the AWT in-use count in a concise, user-friendly manner. It’s supported on both MP-RAS and Windows. With default settings, you have 56 unassigned AWTs per AMP that can be applied to user work. Add to that number three each for the reserve pools for WorkNew and WorkOne, both of which support user work when the unreserved pool is empty, and your total reaches 62 AWTs. If WorkNew and WorkOne in-use counts add up to 62, you have hit the limit for normal processing.
AWTMON collects data from all nodes if you request it, but the default is local collection. It allows you to filter out the output on any AMP for which the in-use count is less than the threshold you specify. AWTMON never reports on parsing engines (PEs) or on AMPs that have in-use counts of zero, which keeps the output volume down. If you’re not on V2R6.1, use the puma utility. How to interpret puma output is explained in the Orange Book “Understanding AMP Worker Tasks.”
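The threshold idea can be illustrated with a few lines of Python. This is a sketch only: it does not read or reproduce AWTMON or puma output, the dictionary layout and the function name `busy_amps` are hypothetical, and it looks only at the WorkNew and WorkOne counts rather than every work type.

```python
# Unassigned pool plus the WorkNew and WorkOne reserves, as computed above.
USER_WORK_LIMIT = 56 + 3 + 3


def busy_amps(samples, threshold):
    """Keep AMPs whose WorkNew + WorkOne in-use count meets the threshold.

    samples maps an AMP id to a dict of in-use counts per work type.
    Returns (amp_id, user_in_use, at_limit) tuples, sorted by AMP id.
    """
    report = []
    for amp_id, counts in sorted(samples.items()):
        user_in_use = counts.get("WorkNew", 0) + counts.get("WorkOne", 0)
        if user_in_use == 0 or user_in_use < threshold:
            continue  # idle or below the threshold: leave it out
        report.append((amp_id, user_in_use, user_in_use >= USER_WORK_LIMIT))
    return report


# Made-up sample counts for three AMPs.
samples = {
    0: {"WorkNew": 50, "WorkOne": 12},
    1: {"WorkNew": 10, "WorkOne": 2},
    2: {},
}
print(busy_amps(samples, threshold=20))
```

With the made-up counts shown, only AMP 0 passes a threshold of 20, and it is flagged as having hit the limit of 62 for normal processing.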
Balancing skew
Dear Carrie: I notice that when I have a hot node, the overheated AMP has a higher number of AWTs in use compared to other AMPs in the configuration. I’d like to abort the session using the most AWTs to get things back to normal, but how do I determine which session that would be?
—Hot Session Tracker
Dear Tracker: A Teradata session issues one request at a time. If there are parallel steps and/or spawned work such as row redistribution, more than one AWT, but usually not more than four, can be used at a time by that one structured query language (SQL) request. In other words, it’s unlikely one session is gobbling up your AWTs.
I would expect demand for AWTs across all AMPs to be pretty much the same, because when Teradata sends all-AMP query steps, the same requirements for AWTs go to all AMPs equally. It’s more likely that the skew (unbalanced demand) in the CPU and I/O processing on the hot AMP is causing its AWTs to be held longer. If one AMP is doing more work than the others for a given query step, then the other AMPs will finish their part of the work sooner, hence releasing their AWTs. This skewed AMP will tend to hold its AWTs longer, not just for the query that introduced the skew, but for other queries whose work on that AMP has been slowed down as a side effect.
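A rough way to see why longer hold times show up as higher in-use counts is Little's law: the average number of AWTs in use is about the arrival rate of steps multiplied by the average time each step holds its AWT. The sketch below applies that general queueing result with made-up numbers; it is an illustration, not a measurement of any real system.

```python
def average_awts_in_use(steps_per_second, seconds_per_step):
    """Little's law: average items in service = arrival rate * time held."""
    return steps_per_second * seconds_per_step


# Same arrival rate on every AMP, because all-AMP steps go to all AMPs.
normal_amp = average_awts_in_use(steps_per_second=20, seconds_per_step=0.5)
hot_amp = average_awts_in_use(steps_per_second=20, seconds_per_step=1.5)
print(normal_amp, hot_amp)
```

In this example both AMPs receive the same 20 steps per second, but the hot AMP takes three times as long per step and so averages 30 AWTs in use against 10 on its neighbors.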
Try to identify and abort the skew-initiating query first, no matter how many AWTs it’s using. This should allow the victim queries to free up their AWTs in a more timely manner across all AMPs. The apparent skew of AWT usage should resolve itself as the aftershocks of the processing skew ease.
Playing the numbers
Dear Carrie: If I reserve AWTs for tactical queries, taking a total of six out of the general AWT pool, can’t I just increase the total number of AWTs up from 80 to 86 and have everything be just the same as it was before?
—Doing the Math
Dear Math: Although you can do that, there are a couple of things to consider before reaching for reserves. First, each additional AWT requires more memory when active, possibly up to 1MB or 2MB. If you add five new AWTs per AMP and have 10 AMPs per node, you’re imposing as much as 100MB more memory demand per node on your system. This may force you to review your memory parameters, maybe tuning down the size of the file segment (FSG) cache, which could have negative performance implications. Second, when more AWTs are active, the CPU dispatch queues are likely to increase, Priority Scheduler will have more processes to manage, context switching overhead may grow and contention objects such as file system locks may increase.
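The memory estimate above is a straight multiplication, sketched here so you can plug in your own configuration. The function name is hypothetical, and the per-AWT figure of 1MB to 2MB is the rough range quoted above, not a measured value.

```python
def extra_memory_mb_per_node(new_awts_per_amp, amps_per_node, mb_per_awt):
    """Worst-case added memory demand per node if every new AWT is active."""
    return new_awts_per_amp * amps_per_node * mb_per_awt


print(extra_memory_mb_per_node(5, 10, 2))   # the example above: 100MB
print(extra_memory_mb_per_node(6, 10, 2))   # going from 80 to 86 AWTs
```

Going from 80 to 86 AWTs with 10 AMPs per node works out to as much as 120MB per node at the 2MB upper bound, or 60MB at 1MB per AWT.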
I don’t make reserving AWTs my first choice when the platform is heavily used because it’s a partial solution, addressing only one resource for one category of work. The Throttle option rules provide a better first choice, allowing you to manage the number of active queries across several different groups of users and thereby cool demand for all resources, benefiting all active work. These concurrency-control rules have proven successful at taking Teradata Warehouse sites off the fault line for both congestion and AWT exhaustion.
Carrie Ballinger, a Teradata Certified Master V2R5, based in El Segundo, CA, is a senior technical consultant in Teradata's Active Data Warehouse Center of Expertise. Carrie joined Teradata in 1988 and has focused on benchmarking, database design and performance. She has authored several performance benchmarking suites and various Teradata Orange Books. In her current role, Carrie interfaces between Teradata users and engineering to support tactical query implementations and workload management.
© Teradata Magazine-June 2006