Teradata Magazine Online

Business time is real-time

GoldenGate's transactional data management (TDM) enables real-time data acquisition.

Business time is increasingly real-time. Businesses have to capture and respond to business events faster and more rigorously than ever to grow their competitive advantage, and most of that advantage now comes from the effective use of information technology. The key tool of the trade is the enterprise data warehouse, coupled with an enterprise analytics framework.

Across the enterprise, each facet of the business gathers data through an assortment of business activities and delivers it to a central data repository. This repository is the enterprise data warehouse (EDW)—where data is captured, analyzed and leveraged to drive better decisions. The quality of these decisions depends not only on the level of sophistication of the analytic applications but also on the underlying data. Data has to be accurate, relevant and complete. But, most importantly, it has to be timely. Timely data ensures timely and better-informed decisions.

The lifecycle of a data record through enterprise analytics starts with a business event taking place. Data acquisition technologies deliver the event record to the data warehouse. Analytical processing helps turn the data into information, and a business decision leads to a corresponding action. To approach real time, the duration between the event and its consequent action needs to be minimized. Typically, it is the data acquisition process that introduces the majority of the latency.
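
The lifecycle above lends itself to a simple latency breakdown. The following Python sketch is illustrative only: the stage names and timestamps are hypothetical, not measurements from any real system. It computes how long each stage takes and which stage contributes most to the delay between event and action.

```python
from datetime import datetime

def stage_latencies(timestamps):
    """Map each lifecycle stage to the time elapsed since the previous point.

    `timestamps` maps point name -> datetime, in lifecycle order.
    """
    names = list(timestamps)
    return {
        f"{start} -> {end}": timestamps[end] - timestamps[start]
        for start, end in zip(names, names[1:])
    }

def slowest_stage(timestamps):
    """Return the name of the stage that adds the most latency."""
    latencies = stage_latencies(timestamps)
    return max(latencies, key=latencies.get)

# Hypothetical record delivered by a batch load hours after the event.
record = {
    "event": datetime(2006, 3, 1, 9, 0),
    "loaded": datetime(2006, 3, 1, 13, 0),
    "analyzed": datetime(2006, 3, 1, 13, 5),
    "decided": datetime(2006, 3, 1, 13, 20),
    "acted": datetime(2006, 3, 1, 13, 30),
}

print(slowest_stage(record))               # event -> loaded
print(record["acted"] - record["event"])   # 4:30:00
```

In this made-up example, data acquisition accounts for four of the four and a half hours between event and action, which is the pattern the paragraph above describes.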

Data acquisition approaches for the real-time enterprise
Numerous technologies serve the data acquisition needs of the data warehouse, but only a few offer real-time data delivery. Selection should weigh data volume, delivery frequency, acceptable latency, data integrity, transformation requirements and processing overhead.

Data acquisition approaches range from scripting and ETL (extract, transform and load) to EAI (enterprise application integration) and TDM (transactional data management). Scripts and ETL deliver data in batches, while EAI and TDM deliver it continuously.

Attribute           | Scripts            | ETL                | EAI                | TDM
--------------------|--------------------|--------------------|--------------------|----------------
Data volume*        | Medium             | Very high          | Low                | High
Frequency           | Intermittent       | Intermittent       | Continuous         | Continuous
Latency             | Medium to high     | Medium to high     | Low                | Low
Data integrity      | No                 | No                 | Guaranteed         | Guaranteed
Transformations     | Intermediate       | Advanced           | Basic              | Basic
Processing overhead | Intermittently high| Intermittently high| Continuous, medium | Continuous, low
* Note that data volumes are relative and depend on a number of factors outside of the technology, such as the data sources in question, network bandwidth and the mechanism for capturing database changes.
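
As a rough aid to reading the comparison, the ratings can be transcribed into a small lookup and filtered by requirement. This is only a sketch: the attribute keys and the `shortlist` helper are invented for illustration, and the ratings are the relative ones from the table, not measured values.

```python
# The comparison table transcribed as data. Ratings are the article's own.
APPROACHES = {
    "Scripts": {"volume": "medium", "continuous": False,
                "latency": "medium to high", "integrity_guaranteed": False,
                "transformations": "intermediate"},
    "ETL": {"volume": "very high", "continuous": False,
            "latency": "medium to high", "integrity_guaranteed": False,
            "transformations": "advanced"},
    "EAI": {"volume": "low", "continuous": True,
            "latency": "low", "integrity_guaranteed": True,
            "transformations": "basic"},
    "TDM": {"volume": "high", "continuous": True,
            "latency": "low", "integrity_guaranteed": True,
            "transformations": "basic"},
}

def shortlist(**required):
    """Return the approaches whose attributes match every required value."""
    return [
        name for name, attrs in APPROACHES.items()
        if all(attrs[key] == value for key, value in required.items())
    ]

print(shortlist(latency="low", integrity_guaranteed=True))  # ['EAI', 'TDM']
print(shortlist(latency="low", volume="high"))               # ['TDM']
print(shortlist(transformations="advanced"))                 # ['ETL']
```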

Scripts are a quick solution to data integration. However, they pose many challenges: they drain developer time and effort, and they raise administrative concerns such as manageability, documentation and service-level agreement (SLA) compliance. On the other hand, they are flexible and economical to develop and modify. Almost every operating system and many DBMSs can invoke scripts from built-in scheduling facilities.

ETL is the ideal solution for the initial loading of very large volumes of data, and it offers advanced transformation capabilities. ETL tasks are typically executed during maintenance windows, when the data sources are quiesced so that they cannot change during data acquisition; a change mid-load could leave the data warehouse inconsistent with the OLTP (online transaction processing) systems. Quiescing the data sources also mitigates concerns about the overhead that ETL tasks place on them. But it means that the data and applications are not always available to the business users who need them.

Originally designed for application integration, EAI solutions have evolved into real-time data acquisition and integration solutions that augment and often co-exist with ETL technologies. They continuously deliver data between source and target systems, provide guaranteed data delivery, feature advanced workflow support and facilitate basic data transformations. However, EAI imposes data volume constraints: because its original purpose is integrating applications rather than data, it is designed to invoke applications and move instructions and messages. Nevertheless, with its ability to move data in real time and maintain the integrity of the data through the integration process, EAI provides the real-time data acquisition capabilities required for some types of operational and active data warehousing needs.

Another approach to real-time data integration is TDM (transactional data management). TDM offers continuous change data capture and delivery with very low overhead and latency. It operates on committed data transactions: it captures them from the OLTP system, applies basic transformations and delivers them to the data warehouse. Although asynchronous by architecture, it offers synchronous-like behavior, operating with sub-second latency while maintaining the integrity of data transactions.
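
The idea of operating only on committed transactions can be sketched in a few lines of Python. This is not GoldenGate's implementation; the log record layout (`txn`, `op`, `table`, `row`) is a hypothetical format chosen for illustration. Operations are buffered per transaction and released as a unit only when the commit record is seen, so uncommitted and rolled-back work never reaches the target.

```python
from collections import defaultdict

def capture_committed(log):
    """Yield (txn_id, operations) for each transaction when its commit is seen.

    Operations of uncommitted or rolled-back transactions are never delivered.
    """
    pending = defaultdict(list)
    for record in log:
        txn, op = record["txn"], record["op"]
        if op == "commit":
            yield txn, pending.pop(txn, [])
        elif op == "rollback":
            pending.pop(txn, None)
        else:
            pending[txn].append((op, record["table"], record["row"]))

# Hypothetical, interleaved transaction log.
log = [
    {"txn": 1, "op": "insert", "table": "orders", "row": {"id": 10, "qty": 2}},
    {"txn": 2, "op": "update", "table": "orders", "row": {"id": 7, "qty": 5}},
    {"txn": 1, "op": "insert", "table": "order_lines", "row": {"order_id": 10}},
    {"txn": 2, "op": "rollback"},
    {"txn": 1, "op": "commit"},
    {"txn": 3, "op": "insert", "table": "orders", "row": {"id": 11, "qty": 1}},
]

delivered = list(capture_committed(log))
# Only transaction 1 is delivered, with both of its operations together.
```

Delivering each transaction as a unit is what keeps related rows, such as an order and its lines, consistent on the target.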

EAI and TDM move data changes and updates rather than entire data sets, which significantly reduces the amount of data movement required. Neither requires the data sources to be quiesced, because both maintain the integrity of data manipulation language (DML) operations. While ETL has the upper hand in initial load and transformations, EAI and TDM are a better fit for continuous data acquisition.
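
To make the difference between moving changes and moving entire data sets concrete, here is a minimal Python sketch. The change-record shape (`op`, `key`, `row`) and the in-memory dictionary standing in for a target table are assumptions for illustration only.

```python
def apply_changes(target, changes):
    """Apply insert/update/delete change records to `target` in order.

    `target` maps primary key -> row dict and is modified in place.
    """
    for change in changes:
        key = change["key"]
        if change["op"] == "delete":
            target.pop(key, None)
        elif change["op"] == "insert":
            target[key] = dict(change["row"])
        elif change["op"] == "update":
            # Merge only the changed columns into the existing row.
            target[key] = {**target.get(key, {}), **change["row"]}
        else:
            raise ValueError(f"unknown operation: {change['op']}")
    return target

# Hypothetical target table and a small batch of captured changes.
warehouse = {
    1: {"name": "Ann", "tier": "silver"},
    2: {"name": "Bob", "tier": "gold"},
}
changes = [
    {"op": "update", "key": 1, "row": {"tier": "gold"}},
    {"op": "delete", "key": 2},
    {"op": "insert", "key": 3, "row": {"name": "Cy", "tier": "bronze"}},
]
apply_changes(warehouse, changes)
```

Three small records bring the target up to date here; a batch reload would instead copy every row of the source table, changed or not.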

Data transformations—where do they belong?
There are a number of advantages to transforming the data within the data warehouse, not the least of which are the massively parallel processing capabilities of the Teradata environment. Ultimately, real-time data acquisition changes the nature of data transformations. With batch processing, data moves between relational and dimensional structures through bulk-loading operations—which means transformations take place either at the data sources or on a centrally located ETL engine. With real-time data delivery, the data warehouse itself becomes the preferred location for staging and transforming the data. The data warehouse is better suited to stage and transform the data in order to reduce data and analysis latency. In addition, this eliminates the need to aggregate data on a centralized server until it is batch processed and it removes an intermediate step from the overall data flow.
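
The staging-then-transform pattern described above can be sketched with Python's built-in `sqlite3` module standing in for the data warehouse. This is an illustration under stated assumptions, not Teradata syntax or a recommended schema: the table and column names are hypothetical. Raw rows land in a staging table as they arrive, and the transformation runs as SQL inside the database rather than on a separate ETL server.

```python
import sqlite3

def load_and_transform(rows):
    """Stage raw sales rows, then aggregate them inside the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging_sales (store_id INTEGER, sku TEXT, amount_cents INTEGER)"
    )
    conn.execute(
        "CREATE TABLE sales_by_store (store_id INTEGER PRIMARY KEY, total_dollars REAL)"
    )
    # Step 1: deliver the raw data to staging without transforming it.
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", rows)
    # Step 2: transform where the data already lives.
    conn.execute(
        """
        INSERT INTO sales_by_store (store_id, total_dollars)
        SELECT store_id, SUM(amount_cents) / 100.0
        FROM staging_sales
        GROUP BY store_id
        """
    )
    conn.commit()
    result = conn.execute(
        "SELECT store_id, total_dollars FROM sales_by_store ORDER BY store_id"
    ).fetchall()
    conn.close()
    return result

print(load_and_transform([(1, "milk", 250), (1, "bread", 300), (2, "milk", 250)]))
# [(1, 5.5), (2, 2.5)]
```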

Real time—a prerequisite for active data warehousing
Traditional data warehouses have been strategic-only resources that helped create reports, analyze events and predict what might happen in the future. Today's data warehouses are not only strategic but also tactical—adding mission-critical decision support to their workload.

Passage to the operational stage of the data warehouse leads to what Wayne Eckerson defines as the "chasm" (DM Review, November 2004), where the data warehouse starts moving from monitoring business processes to driving the business and ultimately the marketplace. To drive the business and the market, the business has to know what is happening "right now" so that it can determine and influence what should happen next.

In the final stage, the data warehouse goes active. As real-time data feeds the data warehouse and matches pre-defined business patterns, business actions are automatically triggered. The active data warehouse can auto-initiate actions to systems based on rules and context to support business processes.
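
A minimal Python sketch of this rule-driven triggering follows. The event fields, rule conditions and action messages are all hypothetical; a production active data warehouse would use its own trigger and messaging facilities rather than in-process lambdas.

```python
def triggered_actions(event, rules):
    """Return the action messages of every rule whose condition matches `event`."""
    return [action(event) for condition, action in rules if condition(event)]

# Hypothetical rules, each a (condition, action) pair.
RULES = [
    (lambda e: e["type"] == "flight_delay" and e["minutes"] >= 30,
     lambda e: f"notify ground staff: flight {e['flight']} delayed {e['minutes']} min"),
    (lambda e: e["type"] == "low_stock" and e["units"] < e["reorder_point"],
     lambda e: f"restock {e['item']} at store {e['store']}"),
]

event = {"type": "flight_delay", "flight": "XY123", "minutes": 45}
print(triggered_actions(event, RULES))
# ['notify ground staff: flight XY123 delayed 45 min']
```

Each incoming event is checked against the predefined patterns as it arrives, so the action fires without anyone having to run a report first.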

GoldenGate TDM for real-time data acquisition
A growing number of companies are starting to use TDM technology to feed their data warehouses in real time. TDM technology captures, routes, delivers and verifies data operations across heterogeneous databases with sub-second latency. With TDM, the instant a data transaction is committed on a source, it is captured and delivered to the data warehouse. This approach virtually eliminates data latency.

In addition to real-time data availability, TDM offers numerous other benefits. A continuous data feed eliminates the dependency on batch windows. In batch operations, data is moved table by table without maintaining referential integrity; with TDM, the integrity of each DML operation is maintained. This removes the need to quiesce the data sources while data movement takes place.

Real-time data warehouses in the real world
While the majority of enterprise data warehouses continue to move towards real-time data acquisition, there are numerous organizations that are seeing the benefits of real-time data warehousing today. These companies include a leading financial institution, airline provider and cellular service provider, as well as retailers and e-tailers.

Let's take the example of a major U.S. airline. The evolution of its data warehouse started with many batch data feeds, which provided standardized reports. With the addition of more sophisticated analytical capabilities, reports evolved to provide insights into revenue fluctuations and booking levels across a myriad of marketing segments. Leveraging the data warehouse as a predictive tool, the airline used this information to optimize flight equipment and schedules based on historic activity.

In the next phase, the airline started to feed the data warehouse with real-time transactional data. Pre-designed queries and triggers were created to detect significant events such as delays and weather activity. The airline could now make not only better-informed strategic decisions but also tactical ones. For example, if the ground staff knows that the next incoming plane carries 20 premium business-class passengers who are about to miss their connecting flights, they can quickly find alternate flight options so that those passengers receive the customer service they have become accustomed to. Today, thanks to this access to customer data and to event notification capabilities, the airline can inform passengers of schedule changes, re-route or re-book passengers, or optimize arrival gate locations, leading to more satisfied customers and ultimately an improved bottom line.

With active data warehousing, a nationwide grocery chain will ultimately benefit from higher levels of customer satisfaction if it can capture and analyze sales data in real time and send a notification to local store managers if, for example, they must restock immediately to avoid running out of skim milk in the next 10 minutes. Perhaps the store hasn't sold any toaster pastries in the last few hours; the floor staff can be notified to check whether a broken jar of pasta sauce may be blocking access to the shelf housing the toaster pastries.
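
The arithmetic behind such notifications is simple once sales arrive in real time. The Python sketch below is a hypothetical illustration: the function names, the 10-minute threshold default and the figures are assumptions, and it projects stock-out time from a constant recent sales rate, which a real system would refine.

```python
def minutes_until_stockout(units_on_hand, units_sold, window_minutes):
    """Project minutes of stock left from the sales rate over a recent window.

    Returns None when nothing sold in the window, since no rate can be inferred.
    """
    if units_sold <= 0:
        return None
    rate_per_minute = units_sold / window_minutes
    return units_on_hand / rate_per_minute

def store_alert(item, units_on_hand, units_sold, window_minutes, threshold_minutes=10):
    """Return an alert message for the store manager, or None if all is well."""
    remaining = minutes_until_stockout(units_on_hand, units_sold, window_minutes)
    if remaining is None:
        return f"check shelf access for {item}: no sales in the last {window_minutes} minutes"
    if remaining <= threshold_minutes:
        return f"restock {item}: about {remaining:.0f} minutes of stock left"
    return None

print(store_alert("skim milk", 8, 30, 30))
# restock skim milk: about 8 minutes of stock left
print(store_alert("toaster pastries", 40, 0, 180))
# check shelf access for toaster pastries: no sales in the last 180 minutes
```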

These organizations continue to push the envelope with active data warehousing. They are achieving measurable gains in their customer satisfaction levels and, ultimately, gains to their bottom line. A significant investment in taking the enterprise data operational or active may not be justifiable for every business today. However, as businesses become increasingly real-time, the enterprise analytics and data warehousing infrastructures should be built and ready to support faster business decisions. After all, technology continues to compress time and, let's face it, the future is closer than it appears.

© Teradata Magazine-March 2006


Copyright by Teradata Corporation 2001-2007.