Applied Solutions
Back to the future

Teradata expands market reach with a new data warehouse appliance.

by Chris Twogood and Ed White

Data warehousing has gone through a structural and functional evolution over the last few decades. The Teradata DBC 1012 was awarded product of the year in 1986 by Fortune magazine. It was a reliable, simple, cost-effective solution for basic decision support and reporting.

As the years passed, data warehousing became more sophisticated and focused on real-time, pervasive business intelligence (BI). To facilitate this evolution, Teradata developed the active data warehouse to handle continuous data loads as well as complex decision support queries and event processing while injecting intelligence into thousands of operational applications—all from one integrated system. Tremendous business value and competitive advantage are associated with this type of data warehousing.

Some companies are just getting started with data warehousing and still only need the basic data warehouse without a lot of extra functionality. The Teradata Data Warehouse Appliance 2550 has been introduced as an integrated solution for organizations that are just starting with BI and need traditional decision support and reporting tools.

Designed with the latest in technology and scalability, the Teradata 2550 can also be used by companies that are experienced in BI and want to get more out of their enterprise data warehouse (EDW). For instance, the platform can be used as an analytical sandbox to develop and test applications before integrating them into the EDW. It can also be implemented as a data mart outside the EDW in unique situations, such as when an organization has a department with short-term data needs that, because of compliance or geographic requirements, fall outside the scope of IT's capabilities.

Simple, powerful, cost-effective
The Teradata 2550 is built on a shared-nothing massively parallel processing (MPP) architecture, with 144 access module processors (AMPs) mapped to 144 disks in a single cabinet. With an AMP-to-disk ratio of 1:1, full system utilization can be achieved with very few concurrent tasks: the system's parallelism generates enough work to keep every disk drive busy with just one or a few active queries. As a result, each AMP gets a proportional share of performance resources, file scans are faster and decision support performance increases. (See figure, below.)

This implementation technology is architecturally optimized for decision support workloads that typically have fewer concurrent users who run longer, complex queries. It does not, however, support Teradata Active System Management since the Teradata 2550 is not targeted to the active mixed workload environment.
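The 1:1 AMP-to-disk layout described above can be pictured with a small sketch. It is an illustration only, not Teradata code: the CRC32 hash, the function names and the thread pool are all assumptions chosen to show how a single query can put every AMP to work on its own slice of the data.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_AMPS = 144  # one AMP per disk in a single cabinet, per the article


def amp_for(key: str, num_amps: int = NUM_AMPS) -> int:
    """Map a row key to an AMP. CRC32 is a stand-in hash (assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_amps


def distribute(rows, num_amps: int = NUM_AMPS):
    """Place each (key, value) row on exactly one AMP's private storage."""
    amps = [[] for _ in range(num_amps)]
    for key, value in rows:
        amps[amp_for(key, num_amps)].append((key, value))
    return amps


def scan_amp(local_rows, predicate):
    """Each AMP scans only its own rows; nothing is shared between AMPs."""
    return sum(value for key, value in local_rows if predicate(key, value))


def parallel_sum(amps, predicate):
    """One query fans out to every AMP, then the partial sums are merged."""
    with ThreadPoolExecutor(max_workers=16) as pool:
        partials = pool.map(lambda rows: scan_amp(rows, predicate), amps)
        return sum(partials)


# Hypothetical sample data: 10,000 rows with values 0..99.
rows = [(f"order-{i}", i % 100) for i in range(10_000)]
amps = distribute(rows)
total = parallel_sum(amps, lambda key, value: value >= 50)
print(total)
```

Because every row lives on exactly one AMP, a single scan query touches all 144 slices at once, which is the sense in which one or a few queries can keep all drives busy.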

The MPP architecture enables parallel queries across the CPU, memory and storage for optimized performance, and automatically manages data placement across disks. Parallelism allows table scan operations to run on all database disks at raw disk transfer rates, which is the key to fast file scans and decision support performance. Because the Teradata 2550 uses a unique disk subsystem, the SQL query Optimizer also uses a different set of cost coefficients to evaluate and compare query plans. As a result, when processing a SQL request the Optimizer can select a query plan different from the one that would be used for the same SQL request on an active data warehouse from Teradata. Because this tuning is built in, database administrators (DBAs) spend less time tuning the system.
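To see why different cost coefficients can lead to a different plan for the same request, consider the minimal sketch below. It is not the Teradata Optimizer: the plan shapes, the three coefficients and every number in it are hypothetical, chosen only to show that a cost-based choice depends on how cheaply the platform reads sequentially.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    sequential_blocks: int  # blocks read by scanning
    random_blocks: int      # blocks read through index lookups
    cpu_rows: int           # rows processed by the CPU


@dataclass(frozen=True)
class CostCoefficients:
    sequential_io: float  # cost per sequentially read block
    random_io: float      # cost per randomly read block
    cpu_per_row: float    # cost per processed row


def plan_cost(plan: Plan, c: CostCoefficients) -> float:
    """Weighted sum of the plan's I/O and CPU work."""
    return (plan.sequential_blocks * c.sequential_io
            + plan.random_blocks * c.random_io
            + plan.cpu_rows * c.cpu_per_row)


def choose_plan(plans, c: CostCoefficients) -> Plan:
    """Pick the cheapest plan under the given coefficients."""
    return min(plans, key=lambda p: plan_cost(p, c))


# Two hypothetical plans for the same SQL request.
plans = [
    Plan("full_table_scan", sequential_blocks=10_000, random_blocks=0,
         cpu_rows=1_000_000),
    Plan("index_lookup", sequential_blocks=0, random_blocks=2_000,
         cpu_rows=50_000),
]

# Hypothetical coefficient sets: one for a platform where sequential
# reads are very cheap, one where they are not.
scan_friendly = CostCoefficients(sequential_io=0.1, random_io=2.0,
                                 cpu_per_row=0.001)
lookup_friendly = CostCoefficients(sequential_io=1.0, random_io=2.0,
                                   cpu_per_row=0.001)

print(choose_plan(plans, scan_friendly).name)
print(choose_plan(plans, lookup_friendly).name)
```

With the scan-friendly coefficients the full table scan is cheaper; with the other set the index lookup wins, even though the request and the candidate plans are identical.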

Other benefits include:
Best-in-class workload management features. Users can be assigned to new groups, and the pre-configured workload settings can be set to rush, high, medium and low. If the system is not fully utilized, the CPU is automatically available to lower-priority groups, regardless of the base setting. The system also supports threshold exception actions, which enable automatic management of long-running queries.
Scalable up to 140TB. A total of 12.6TB of user data (with 30% compression) can be contained in a single cabinet, and the system can grow to 11 expansion cabinets with up to 140TB of user data. Expansion cabinets are identical to the system cabinets, with the exception of the BYNET Ethernet switches.
Enterprise-class components. The system is composed of proven Intel nodes with redundant power supplies, UPS power, redundant system management, and redundant BYNET with fault tolerance, heartbeat monitoring and load balancing.
System availability. Power failure protection is ensured with dual AC inputs. RAID-1 guards against disk failure, and cliques protect against node failure.
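The workload management behavior in the list above can be sketched in a few lines. The four group names come from the article; the weights, the ten-minute limit and the function names are hypothetical, and the real product's rules are certainly richer than this.

```python
# Hypothetical relative weights for the four pre-configured settings.
WEIGHTS = {"rush": 8, "high": 4, "medium": 2, "low": 1}


def cpu_shares(active_groups):
    """Split 100% of the CPU among groups that currently have work.

    Idle groups receive nothing, so their share flows to whoever is
    active, regardless of that group's base setting.
    """
    active = [g for g in WEIGHTS if g in active_groups]
    total = sum(WEIGHTS[g] for g in active)
    if total == 0:
        return {}
    return {g: 100.0 * WEIGHTS[g] / total for g in active}


def apply_threshold(group, elapsed_seconds, limit_seconds=600,
                    demote_to="low"):
    """Hypothetical threshold exception: demote a long-running query."""
    return demote_to if elapsed_seconds > limit_seconds else group


print(cpu_shares({"rush", "high", "medium", "low"}))
print(cpu_shares({"low"}))
print(apply_threshold("high", elapsed_seconds=900))
```

When only the low-priority group has work, it receives the whole CPU, which mirrors the article's point that unused capacity is automatically available to lower-priority groups.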

Teradata Database 12.0 compatibility
Simple and easy to use, the Teradata 2550 is delivered ready to run and can be live in a few hours. The cost-effective, fully bundled solution is pre-installed in a single power-efficient cabinet and can be connected to other applications and systems.

The platform has integrated Intel servers, enterprise-class storage and an open SUSE Linux 64-bit operating system. Optimized for traditional decision support analysis, the platform includes system management and data load tools such as the Teradata Utility Pack, Teradata Manager and Teradata Parallel Transporter Load and Export Operators.

Teradata Database 12.0 is the foundation for the entire Teradata platform family. (See bottom bar: "An extended family," below.) While the database configurations have been tuned for decision support workloads, the core underlying functions remain the same. This allows companies with expanding workloads to migrate easily to more powerful systems without costly changes to load programs, data models or underlying structures.

To optimize the Teradata 2550 for traditional decision support and analytical workloads, Teradata Database 12.0 comes preset with:
Cache Threshold modified to 100% to optimize throughput for workloads and environments with partitioned primary index (PPI) tables, fewer secondary indexes and low to moderate concurrency. Because spools produced by the Teradata 2550 are expected to be larger than those produced by an active data warehouse, a fully cached threshold keeps more generated spool space in memory and reduces I/O.
File Segment Cache with a default setting of 80%, which holds the most recently used database segments. When the system reads a database block, it checks the cache first. If the block is cached, the system avoids the overhead of rereading the block from disk.
Cylinder Read to enable high-performance disk transfers that are efficient for table scans and decision support workloads. The smaller the actual block size of the table, the greater the I/O benefit. In the Teradata 2550 the default number of cylinder slots per AMP is set to eight.
Perm Database Size of 254 sectors for times when Cylinder Read does not apply. In full table scan scenarios when increased throughput capacity of disk drives is desired, large data blocks are beneficial.
ReadAhead feature that is available when sequential file access workloads are running. This function provides improved read performance and faster access times by enabling the data to be pre-read, based on header lookups. With this option turned on, when the file system tries to read a data block from the file, it issues a read-ahead I/O and the next data block is brought into memory.
Pre-Fetch feature to improve read performance and query response times by pre-reading the data into cache. When the Teradata Database accesses the cylinder index, the disk controller firmware uses pre-fetching to move the entire cylinder to the controller cache. Load balancing, another enhancement made to the disk controller firmware, determines the availability of the drives and directs I/O requests to the less-busy drive in the RAID-1 pair. This results in higher throughput for sequential scan-based workloads.
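Three of the mechanisms in this list (checking the cache before going to disk, reading the next block ahead of time, and sending a read to the less-busy drive of a RAID-1 pair) can be illustrated with a toy model. None of this is Teradata code: the class names, the least-recently-used eviction policy and the queue-depth measure of "busy" are assumptions, and a real system would issue the read-ahead asynchronously.

```python
from collections import OrderedDict


class BlockStore:
    """Toy disk: maps block number to data and counts physical reads."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.physical_reads = 0

    def read(self, block_no):
        self.physical_reads += 1
        return self.blocks[block_no]


class SegmentCache:
    """Cache-first reads with optional read-ahead of the next block."""

    def __init__(self, disk, capacity, read_ahead=True):
        self.disk = disk
        self.capacity = capacity  # must be at least 1
        self.read_ahead = read_ahead
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _load(self, block_no):
        self.cache[block_no] = self.disk.read(block_no)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used

    def get(self, block_no):
        if block_no in self.cache:
            self.hits += 1
            self.cache.move_to_end(block_no)
        else:
            self.misses += 1
            self._load(block_no)
        data = self.cache[block_no]
        nxt = block_no + 1
        # Read-ahead: bring the following block into memory now.
        if self.read_ahead and nxt in self.disk.blocks and nxt not in self.cache:
            self._load(nxt)
        return data


def pick_mirror(queue_depths):
    """Return the index of the less-busy drive in a mirrored pair."""
    return min(range(len(queue_depths)), key=queue_depths.__getitem__)


def scan(read_ahead):
    """Sequentially read 8 blocks; return (hits, misses, physical reads)."""
    disk = BlockStore({n: f"block-{n}" for n in range(8)})
    cache = SegmentCache(disk, capacity=4, read_ahead=read_ahead)
    for n in range(8):
        cache.get(n)
    return cache.hits, cache.misses, disk.physical_reads


print(scan(read_ahead=True))
print(scan(read_ahead=False))
print(pick_mirror((3, 1)))
```

In this model read-ahead does not reduce the number of physical reads; it moves them earlier, so that during a sequential scan the requested block is usually already in memory when the request arrives.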

Basic needs
Different organizations have different BI needs. Some companies are just starting out with a data warehouse and require only the basic system and tools. Other companies want a second, smaller data warehouse to complement their existing EDW environment. The Teradata 2550 can serve both purposes.

Businesses today that want to extend their analytical reach throughout the enterprise have more data warehousing choices than were available decades ago. With the Teradata 2550, companies can be assured of a system that implements modern-day technology but with the simplicity reminiscent of the first data warehouse environment of years past.

An extended family

Teradata's powerful platform family addresses customers' business and technical needs. These platforms are powered by the Teradata Database 12.0 engine and can accommodate data of all levels of complexity and size, from departmental to active data warehousing.

—C.T. and E.W.

Chris Twogood is the product marketing manager for the Teradata platforms.

Ed White, director of Teradata Product Marketing, manages the global marketing of Teradata platforms, products and programs.

Teradata Magazine-September 2008

Copyright © 2008 Teradata Corporation. All rights reserved.