Data Skew on a UPI table (with HIGH volume) - How?
Posted 3/31/2008 7:08:41 PM
We have a SET table defined with a UPI; it holds 9.3 million rows on a 1400-AMP, V2R6 Teradata system. Please look at the numbers below. The AMP that has the MAX rows has 3.47 times as many rows as the AMP with the MIN rows. This does not make sense to me, given that we have a UPI and a high volume of rows in the table, which should give an even distribution.

xTimesGreaterMaxAvg - 1.56
xTimesGreaterMaxMin - 3.47

Mx - 1393664
Avrg - 890657.07
Mn - 401920
CurPerm - 1282546176
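
For anyone checking the two ratios, here is a minimal sketch that recomputes them from the Mx, Avrg and Mn figures quoted above. The variable names are only illustrative, and the units are whatever the original query reported.

```python
# Per-AMP figures copied from the post (units as reported there).
mx = 1393664
avrg = 890657.07
mn = 401920

# Ratios corresponding to xTimesGreaterMaxAvg and xTimesGreaterMaxMin.
max_over_avg = mx / avrg
max_over_min = mx / mn

print(round(max_over_avg, 2), round(max_over_min, 2))
```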

Can someone please shed some light on this? Is there something we can look at to improve data distribution? Any help would be greatly appreciated.

PS: The table also has a PPI defined on a date field which is part of the UPI (if this matters).

Thanks,
Sayee.


--
Sayee
Post #11104
Posted 4/1/2008 8:49:26 AM
Hi Sayee,
Most of the skew is due to the high number of AMPs:

There are 65536 entries in the hash map, so the average number of entries per AMP is:
SELECT 2**16/1400 -> 46.81

Of course there's no .81 of an entry, so some AMPs will get 46 and some 47:
SELECT 47/46.0000 -> 1.0217 -> 2.17 percent more data, even with a UPI.
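
The same arithmetic as a small Python sketch, which also shows how many AMPs end up with the extra entry. It assumes the 65536 hash map entries are spread as evenly as possible over the AMPs; the constant names are illustrative only.

```python
HASH_BUCKETS = 2 ** 16  # 65536 hash map entries
AMPS = 1400

# Assumption: entries are spread as evenly as possible across AMPs,
# so every AMP gets `base` entries and `extra` AMPs get one more.
base, extra = divmod(HASH_BUCKETS, AMPS)

amps_with_more = extra          # AMPs holding base + 1 entries
amps_with_fewer = AMPS - extra  # AMPs holding base entries
ratio = (base + 1) / base       # data ratio between the two groups

print(base, amps_with_more, amps_with_fewer, round(ratio, 4))
```

With these inputs, 1136 AMPs own 47 entries and 264 own 46, which is where the 2.17 percent comes from.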

Also, the table is still small for a 1400-AMP system: about 6642 rows per AMP on average, which is probably not enough for a perfectly even distribution. Now add this small skew to the 2.17 percent...
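
To get a rough feel for how much purely random variation to expect at this row count, one can treat each row as landing on a uniformly random AMP. This is an idealized assumption, not a model of Teradata's actual hashing or of how perm space is allocated, so it only covers row counts and ignores any storage-level effects.

```python
import math

ROWS = 9_300_000
AMPS = 1400

# Average number of rows per AMP.
rows_per_amp = ROWS / AMPS

# Assumption: each row lands on a uniformly random AMP, so the per-AMP
# row count is approximately Poisson and its standard deviation is
# about the square root of the mean.
std_dev = math.sqrt(rows_per_amp)
relative_spread = std_dev / rows_per_amp

print(round(rows_per_amp), round(std_dev, 1), f"{relative_spread:.2%}")
```

The relative spread shrinks as the square root of the row count grows, so in this model loading more data narrows the row-count variation between AMPs.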

Dieter
Post #11108
Posted 4/8/2008 2:15:27 PM
Thank you, Dieter! That was pretty good information. I never would have thought 9MM rows was not enough volume. :-)

We now have more than one day's data in those tables, and we're seeing much better distribution numbers.



--
Sayee
Post #11183