Teradata/IBM partnership forges ahead for data integration optimization.
by Lori Janies
In the past, data integration often meant using standard extract, transform and load (ETL) or extract, load and transform (ELT) tools to get
the day's business transactions into the data warehouse.
Thought leaders at many companies now understand that optimal data integration means leveraging all valuable data from disparate systems
across the enterprise. By offering a complete and holistic view of enterprise information, optimized data integration improves analysis
capabilities at every level, enables executives to pinpoint effective strategies and extends an organization's competitive edge.
IBM InfoSphere DataStage Balanced Optimization, a new extension of the DataStage product, combines the immense data processing power of the
Teradata system with the hybrid ETL and ELT capabilities from IBM to create a solution that is raising the bar for data integration
technologies. Acquiring such powerful data integration capabilities can mean an enormous paradigm shift for companies that are used to
relying on a traditional ETL solution.
For example, a firm might have a batch load window of only six hours per night to transform and move massive amounts of data between
systems. With a traditional ETL solution, that entire time might be needed to merely run the batch scripts necessary to synchronize the
company's financial systems, upload the day's business transactions and generate standardized reports, leaving no time for extensive data
transformation or granular information processing.
But with a hybrid solution like InfoSphere DataStage Balanced Optimization, the firm's batch processing tasks are distributed between the
IBM system and the Teradata Database to leverage the additional available processing power from the Teradata system during off-peak
periods. This balanced optimization of ELT and ETL tasks enables the firm to increase its efficiency and data processing capabilities to
previously unattainable levels.
The highly scalable Teradata load and unload utilities empower the system to not only handle the synchronization and reporting of the
company's financial transactions, but also extract inventory, orders and sales data from the operational systems and then aggregate, sort and
filter the information. With this influx of new data, the firm's executives can readily expand their insights to include trend analysis,
cross-marketing opportunities and strategic issues to drive better, more informed decisions. Additionally, marketing analysts, customer
service representatives and employees across the enterprise can gain unprecedented access to current, customized data.
System functionality and architecture
The hybrid ETL and ELT functionality developed in InfoSphere DataStage Balanced Optimization enables the tool components to fully harness the
excess capacity and computing power available in Teradata systems during nonpeak periods for pure ELT and ETL integration. This enables
companies to load larger and more complete volumes of data into the Teradata system in a shorter period of time.
Integration is taken to the next level through transform, extract, transform, load and transform (TETLT) functionality, unique in the
industry, in which data is first processed in the source database and then extracted to DataStage. DataStage's high-performance parallel
engine then executes further transformations and loads the results into a target database, where additional transformations are performed.
Furthermore, a graphical user interface gives even non-IT users easy control and monitoring of ETL, ELT and TETLT functionality.
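The TETLT sequence described above can be illustrated with a minimal sketch. It is not DataStage or Teradata code: two in-memory SQLite databases stand in for the source and target systems, plain Python stands in for the parallel engine, and the table names, column names and the run_tetlt function are hypothetical.

```python
import sqlite3


def run_tetlt(source, target):
    """Move order data from source to target in TETLT order (illustrative only)."""
    # T + E: filter inside the source database, then extract the surviving rows.
    rows = source.execute(
        "SELECT region, amount_cents FROM orders WHERE status = 'shipped'"
    ).fetchall()

    # T: transform in the integration engine (plain Python in this sketch).
    cleaned = [(region.strip().upper(), cents / 100.0) for region, cents in rows]

    # L: bulk-load the transformed rows into a staging table on the target.
    target.execute("CREATE TABLE staging_sales (region TEXT, amount REAL)")
    target.executemany("INSERT INTO staging_sales VALUES (?, ?)", cleaned)

    # T: run the final transformation inside the target database.
    target.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM staging_sales GROUP BY region"
    )
    return target.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


# Hypothetical sample data standing in for an operational system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (region TEXT, amount_cents INTEGER, status TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(" east", 1250, "shipped"), ("East ", 750, "shipped"),
     ("west", 500, "shipped"), ("west", 9900, "cancelled")],
)
target = sqlite3.connect(":memory:")
result = run_tetlt(source, target)
```

In this sketch the cancelled order never leaves the source system, and the target database, rather than the engine, computes the regional totals.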
By balancing the workload of complex or large data integration tasks via ETL/ELT or TETLT, companies can deliver more accurate and up-to-date
information in its entirety to users, processes or applications and leverage extra capacity in their Teradata systems. Data optimizations
available include:
- Processing (transformations, modifications and filtering)
- Aggregation
- Joins, lookups and merges
- Bulk I/O operations, including loading, unloading and exporting
- Temporary staging tables
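To make the aggregation case concrete, the following sketch contrasts aggregating in the integration engine with pushing the same aggregation down into the database. It assumes an in-memory SQLite database as a stand-in for a Teradata system; the sales table and both function names are hypothetical, and the sketch does not show how Balanced Optimization itself rewrites a job.

```python
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (store TEXT, units INTEGER)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 3), ("A", 4), ("B", 10), ("B", 1), ("B", 2)],
)


def aggregate_in_engine(conn):
    """ETL style: extract every row, then aggregate outside the database."""
    rows = conn.execute("SELECT store, units FROM sales").fetchall()
    totals = defaultdict(int)
    for store, units in rows:
        totals[store] += units
    return dict(totals), len(rows)


def aggregate_in_database(conn):
    """ELT style: let the database aggregate and extract only the summary."""
    rows = conn.execute(
        "SELECT store, SUM(units) FROM sales GROUP BY store"
    ).fetchall()
    return dict(rows), len(rows)


engine_totals, engine_rows_moved = aggregate_in_engine(db)
db_totals, db_rows_moved = aggregate_in_database(db)
```

Both functions return the same totals, but the second moves one row per store out of the database instead of one row per sale, which is the kind of saving that matters when the detail table is large.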
Solution benefits
Companies looking to increase their data integration capabilities can receive numerous benefits through IBM InfoSphere DataStage Balanced
Optimization, such as:
- Improved system performance through minimized I/O. Data processing can be accomplished in either the source database or
  Teradata system to minimize data-copying requirements, with data pipelined in parallel between systems without touching the disk
- Comprehensive source and target support for a virtually unlimited number of heterogeneous data sources
- Optimized file and queue processing, empowering the system to handle large numbers of small files or huge files that
  ordinarily would not all fit into memory
- Scalable platform, enabling organizations to create flexible, optimized data integration architectures to increase the
  efficiency of their developers
- Consistent data processing functionality that is applicable to all industries
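The idea of pipelining records between steps without first materializing a whole file can be sketched with Python generators. This is a single-process illustration only, not the parallel pipelining DataStage performs; the record layout and function names are assumptions made for the example.

```python
import io


def read_records(stream):
    """Yield one parsed record at a time instead of loading the whole input."""
    for line in stream:
        line = line.strip()
        if line:
            sku, qty = line.split(",")
            yield sku, int(qty)


def keep_positive(records):
    """Filter step: pass through only records with a positive quantity."""
    for sku, qty in records:
        if qty > 0:
            yield sku, qty


def total_quantity(records):
    """Aggregation step: consume the stream and keep only a running total."""
    return sum(qty for _, qty in records)


# A small in-memory stream stands in for a file too large to hold in memory.
data = io.StringIO("a1,5\nb2,0\nc3,7\n\n")
total = total_quantity(keep_positive(read_records(data)))
```

Each record flows through the read, filter and total steps one at a time, so memory use stays flat regardless of the input size.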
InfoSphere DataStage Balanced Optimization offers customers running InfoSphere DataStage and using a Teradata system unparalleled data
integration functionality that draws on both best-of-breed systems.
For companies that are bound by the restricted processing power of a traditional ETL/ELT solution, implementing the Teradata and InfoSphere
DataStage Balanced Optimization solution could create countless opportunities for data processing, integration and analysis that were never
before possible.
PRODUCTS: IBM InfoSphere DataStage (ETL), InfoSphere DataStage Balanced Optimization (ELT, TELT, TETLT)
ATTRIBUTES: IBM InfoSphere DataStage integrates data across multiple and high-volume data sources and target applications on
demand with a high-performance parallel framework, extended metadata management and enterprise connectivity.
BENEFITS: IBM InfoSphere DataStage provides complete connectivity between any data source and application, ensuring the most
relevant, complete and trusted information is integrated and used to achieve new levels of business transformation.
COMPANY: IBM delivers market-leading solutions, allowing customers to achieve new levels of innovation through best-of-breed
information integration and master data management capabilities.
Lori Janies is a senior-level business and technology writer based in Minneapolis.
Teradata Magazine-December 2008