Teradata Magazine Cover Teradata Magazine Online  
Register Help Password
Password:
Quick Links
Current Issue
Archives
Teradata.com
Teradata Magazine Rss Feed
ARCHIVES Search Teradata Magazine Online:  














RFID
The impact of RFID on data warehousing

Corporate performance reporting
Leveraging industry data models for financial management excellence

Tech Tip
Dynamic select statements in stored procedures

Data Quality
Wanna save money? Get rid of bad data!





Query? Got Questions about Teradata?
Send your technical inquiry to the experts in Teradata engineering. From the architects to the developers, we'll get your question to the appropriate certified Teradata professional for resolution.
teradata.query
@teradata-ncr.com

 

FAQs

Q1: How do you calculate the average of timestamp values in Teradata? For example,

Create table avgts(i int, ts timestamp(0));
ins avgts(1, '2003-09-01 12:00:00');
ins avgts(1, '2003-09-03 12:00:00');
select i, avg(ts) from avgts group by 1;
gives me the error:
5407: Invalid operation on ANSI Datetime or Interval value

A1: You can do conversions like this:

sel runname,qryname,sum(timeinsecs)/count(=) as avgtime, avgtime/3600 as timehours
,(avgtime-(3600=timehours))/60 as timeminutes
,avgtime-((3600=timehours)+60=timeminutes) as timeseconds
,trim(timehours)||':'||trim(timeminutes)||':'||trim(timeseconds) as averagetime
from results
group by 1,2
order by 1,2

Or for a much simpler expression for what you are doing, check into ANSIINTERVAL!
select avg((cast(timestampfield as time) - time '00:00:00.000000') hour to second)
from table

Q2: Will the optimizer materialize a multi-table view before any additional table joins or constraints are made? Here is an example:

I have an aggregated view that is a join of tables A, B, C and D. When users query this view, they sometimes want to join the view to table D to further constrain on column Z.

Will the optimizer join tables A, B, C and D first without the extra table D constraint on column Z? Or will the optimizer recognize the additional column Z constraint before materializing the view?

Would the plan be different on Teradata Database V2R5 compared to Teradata Database V2R4?

A2: The case is a bit abstract here, but it makes me wonder if the intent of the query will actually be realized.

You state that the view specifies an aggregation over the result of a join between tables A, B, C and D, but you do not state whether Z is in the group list and made visible via the view. I am guessing that it isn't; if it were, then it would be easy to apply the condition to Z without an additional join to D.

If Z is not visible in the view, then it is highly likely that the additional join to D constrained on Z will not create the desired result. The aggregation will be performed first as specified, then the additional join to the constrained version of D will be performed—it has to be executed this way to get the correct SQL result for the query. This is likely to be a very different result than if the constraint was applied to Z on D prior to the aggregation because rows will be eliminated prior to consideration for grouping.

Q3: With the arrival of Teradata Database V2R5 comes the long-awaited multi-value compression facility. Can you discuss its features and its pros and cons?

A3: The use of multi-value compression within Teradata Database V2R5 can provide some big savings, both in disk space and in the general performance of queries.

As an example, one customer had a table in Teradata Database V2R4.1, which used single-value compression, and they saved 30% in disk space. Since an upgrade to Teradata Database V2R5, the customer has managed to save an additional 25% in disk space and has noticed a general performance improvement of around 40%, which the customer thinks had a lot to do with the fragmentation of the table as well. The use of compression and the additional CPU overhead was negligible, estimated around 0.5%.

To date, customers who have implemented multi-value compression under Teradata Database V2R5 are seeing favorable results all around.

All in all, however, users have noticed a "general performance improvement."

Q4: We would like to be able to store the results of any and all queries (the actual answer sets) for subsequent auditing. Our fear is that just having the query is not enough since rerunning the query might not produce the same answer set the user originally got.

It seems to me it would have to be part of the database query log, with an option provided to store, say, up to 10K of the final spool before it's sent to the user. But that option is currently not available.

Any thoughts as to how we could store the results of any and all queries (or some portion thereof) for some period of time?

A4: Run every query as a CREATE TABLE AS WITH DATA with ROWNUMBER as the first column in the select. Then select from that table to return the result to the user. This process requires a driver program to grab the SQL and rewrite it in order to do the create with an appropriate table name that can be tracked later (date, user, etc.). The driver should probably log on with a driver ID so the user can't delete/drop the table.

It would not be practical to put in a database query log since every result has a different table layout. WARNING: The log will be vastly larger than the data, so be sure to have enough space available to support this query.

Q5: In using the Teradata Priority Scheduler Facility, the milestone type parameter is set at the performance group, not at the allocation group. Therefore, assuming no manual intervention through the PM/API, the query will be assigned to the allocation group associated with the performance group at logon and then be handled according to the milestone type rules defined.

One possible exception to this case might be a situation where you implement a process that switches your PSF schemes, either based on some event or at regular time intervals. If, in this scenario, you were to "redefine" your performance group such that you change the milestone type from "time" to "query," I'm not sure what would happen to any "in-flight" queries. I'm not recommending that, but I'm guessing that it's possible.

Does PSF continually reevaluate the definition of performance groups and/or allocation groups when computing the relative weighting?

A5: You can change a performance group definition at any time, with active work running under its control. This will cause a change to the Teradata Priority Scheduler definition, which will cause the system.GDO record to be rewritten. Any time the system.GDO record is changed, all CPU accumulations (whether at the session or the query level) across all performance groups are zeroed out. The system.GDO record exists on each node and is the operating system's copy of the current priority settings.

This means that if you issue any schmon command, it will change a Teradata Priority Scheduler component. Say an allocation group weight in another resource partition, or changes a system level limit, that change will cause all queries under the control of any milestone limit in any performance group to start over in their accumulations of CPU.

Most definitely, relative weights will be recalculated when the system.GDO record changes, as is continuously happening under normal circumstances.




Copyright by Teradata Corporation 2001-2007.