|

FAQs
Q1: How do you
calculate the average of timestamp values in Teradata? For example,
Create table
avgts(i int, ts timestamp(0));
ins avgts(1, '2003-09-01 12:00:00');
ins avgts(1, '2003-09-03 12:00:00');
select i, avg(ts) from avgts group by 1;
gives me the error:
5407: Invalid operation on ANSI Datetime or Interval value
A1:
You can do conversions like this:
sel runname,qryname,sum(timeinsecs)/count(=)
as avgtime, avgtime/3600 as timehours
,(avgtime-(3600=timehours))/60 as timeminutes
,avgtime-((3600=timehours)+60=timeminutes) as timeseconds
,trim(timehours)||':'||trim(timeminutes)||':'||trim(timeseconds)
as averagetime
from results
group by 1,2
order by 1,2
Or for a much simpler expression for what you
are doing, check into ANSIINTERVAL!
select avg((cast(timestampfield as time) - time '00:00:00.000000')
hour to second)
from table
Q2: Will the optimizer
materialize a multi-table view before any additional table joins
or constraints are made? Here is an example:
I have an aggregated
view that is a join of tables A, B, C and D. When users query
this view, they sometimes want to join the view to table D to
further constrain on column Z.
Will the optimizer
join tables A, B, C and D first without the extra table D constraint
on column Z? Or will the optimizer recognize the additional column
Z constraint before materializing the view?
Would the plan be different on Teradata
Database V2R5 compared to Teradata Database V2R4?
A2:
The case is a bit abstract here, but it makes me wonder if the
intent of the query will actually be realized.
You state that the view specifies an aggregation
over the result of a join between tables A, B, C and D, but you
do not state whether Z is in the group list and made visible via
the view. I am guessing that it isn't; if it were, then
it would be easy to apply the condition to Z without an additional
join to D.
If Z is not visible in the view, then it is
highly likely that the additional join to D constrained on Z will
not create the desired result. The aggregation will be performed
first as specified, then the additional join to the constrained
version of D will be performed—it has to be executed this
way to get the correct SQL result for the query. This is likely
to be a very different result than if the constraint was applied
to Z on D prior to the aggregation because rows will be eliminated
prior to consideration for grouping.
Q3: With the arrival
of Teradata Database V2R5 comes the long-awaited multi-value compression
facility. Can you discuss its features and its pros and cons?
A3:
The use of multi-value compression within Teradata Database V2R5
can provide some big savings, both in disk space and in the general
performance of queries.
As an example, one customer had a table in
Teradata Database V2R4.1, which used single-value compression,
and they saved 30% in disk space. Since an upgrade to Teradata
Database V2R5, the customer has managed to save an additional
25% in disk space and has noticed a general performance improvement
of around 40%, which the customer thinks had a lot to do with
the fragmentation of the table as well. The use of compression
and the additional CPU overhead was negligible, estimated around
0.5%.
To date, customers who have implemented multi-value
compression under Teradata Database V2R5 are seeing favorable
results all around.
All in all, however, users have noticed a "general
performance improvement."
Q4: We would like
to be able to store the results of any and all queries (the actual
answer sets) for subsequent auditing. Our fear is that just having
the query is not enough since rerunning the query might not produce
the same answer set the user originally got.
It seems to me
it would have to be part of the database query log, with an option
provided to store, say, up to 10K of the final spool before it's
sent to the user. But that option is currently not available.
Any thoughts as
to how we could store the results of any and all queries (or some
portion thereof) for some period of time?
A4:
Run every query as a CREATE TABLE AS WITH DATA with ROWNUMBER
as the first column in the select. Then select from that table
to return the result to the user. This process requires a driver
program to grab the SQL and rewrite it in order to do the create
with an appropriate table name that can be tracked later (date,
user, etc.). The driver should probably log on with a driver ID
so the user can't delete/drop the table.
It would not be practical to put in a database
query log since every result has a different table layout. WARNING:
The log will be vastly larger than the data, so be sure to have
enough space available to support this query.
Q5: In using the
Teradata Priority Scheduler Facility, the milestone type parameter
is set at the performance group, not at the allocation group.
Therefore, assuming no manual intervention through the PM/API,
the query will be assigned to the allocation group associated
with the performance group at logon and then be handled according
to the milestone type rules defined.
One possible exception
to this case might be a situation where you implement a process
that switches your PSF schemes, either based on some event or
at regular time intervals. If, in this scenario, you were to "redefine"
your performance group such that you change the milestone type
from "time" to "query," I'm not
sure what would happen to any "in-flight" queries.
I'm not recommending that, but I'm guessing that it's
possible.
Does PSF continually
reevaluate the definition of performance groups and/or allocation
groups when computing the relative weighting?
A5:
You can change a performance group definition at any time, with
active work running under its control. This will cause a change
to the Teradata Priority Scheduler definition, which will cause
the system.GDO record to be rewritten. Any time the system.GDO
record is changed, all CPU accumulations (whether at the session
or the query level) across all performance groups are zeroed out.
The system.GDO record exists on each node and is the operating
system's copy of the current priority settings.
This means that if you issue any schmon command,
it will change a Teradata Priority Scheduler component. Say an
allocation group weight in another resource partition, or changes
a system level limit, that change will cause all queries under
the control of any milestone limit in any performance group to
start over in their accumulations of CPU.
Most definitely, relative weights will be recalculated
when the system.GDO record changes, as is continuously happening
under normal circumstances.
|