Fresh Perspectives

Hills, peaks and valleys

He who takes the wrong road makes the journey twice. —Proverb

by Rob Armstrong, Director of data warehouse support at Teradata

It has often been said that data warehousing is a journey. When you're on that journey, you will pass through Report-ville, Cube-city, Analytics-burg and Predictive-place and head to Triggered-town on your way to the bustling metropolis of Active Data Warehouse.

As with any journey, it helps to chart your path when developing an active data warehouse; you'll want a general idea of the route you are going to take through the hills, from data consolidation to activation.

So just what are the hills, peaks and valleys on this journey? The peaks are the major towns mentioned above where you'll stop, with the uphill climbs being the various reinvestments in your data warehouse environment as you move forward. You want to avoid the downhill jaunts, side roads and dead ends—these are the pitfalls when you stop evolving in the data warehouse process and start losing ground.

With that in mind, let's start traveling along the road to Active Data Warehouse.

Data mart consolidation

"We're all entitled to our own opinions. But none of us can afford to be wrong in our facts." —Mort Crim

Many companies start their journeys with data spread across the organization, some of it redundant or inconsistent, some of it even unaccounted for. Instead of having a warehouse they have a "where-house"! The general consensus is that consolidating this data would let users access it with greater ease.

While that is true to a point, the act of bringing the data together is really the start of identifying the problems that must be overcome. Data consolidation will give users a quick boost: no longer will they need to search multiple places or move data around to join it. Whatever potholes the consolidation effort uncovers, it is an important first stop on our journey.

Of course, the real questions are which data should be consolidated and what value consolidation brings to the corporation.

For the consolidation effort to be successful, it must be based on business needs. You must first determine which data should be brought together, then find the current locations of that data. Ultimately, you will understand the extent of your data redundancy or inconsistency and begin the process needed to ensure data quality.

The danger is thinking the journey is over once you reach this point. Simply consolidating the data will highlight inconsistencies in definition and timing. As users start to join data together more efficiently, they might question its accuracy, or they could have more trouble running analytics because of inconsistent data types, definitions or content. They might even question why they started the trip.

This is good if it is seen for what it is: identifying the problems that must be overcome to establish a sound data and information management strategy. However, oftentimes the inconsistency is not addressed, leading users to mistrust the data warehouse. Consequently, users are prone to return to their silo systems, thereby diminishing the data warehouse value.
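
To make the point concrete, here is a minimal sketch of how consolidated data exposes inconsistency. It assumes two hypothetical data marts ("sales" and "billing") that each hold a customer name keyed by customer ID; the mart names, keys and values are invented for illustration and are not taken from any real system.

```python
def find_conflicts(marts):
    """Return keys that appear in more than one mart with differing values.

    `marts` maps a mart name to a dict of {key: value}.
    """
    seen = {}
    for mart, records in marts.items():
        for key, value in records.items():
            seen.setdefault(key, {})[mart] = value
    return {
        key: by_mart
        for key, by_mart in seen.items()
        if len(by_mart) > 1 and len(set(by_mart.values())) > 1
    }


# Hypothetical sample data: the same customer is named differently in each mart.
marts = {
    "sales": {"C1": "Acme Corp", "C2": "Bolt Ltd"},
    "billing": {"C1": "ACME Corporation", "C2": "Bolt Ltd", "C3": "Cog Inc"},
}

conflicts = find_conflicts(marts)
```

Here only customer C1 is reported, because it is the one record held in both marts with two different definitions. A list like this is the raw material for the data quality work that follows consolidation.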

Now is not the time to end your journey. Regardless of the impending hills, you must now gas up the car and drive to the next stop.

Data integration

"A man with one watch knows what time it is. A man with two watches is never sure."
—Segal's Law

As mentioned earlier, the problem with (or benefit of) consolidation is that it highlights the real data management and usability issues. The point to take away is that consolidation is not the same as integration. Confusing the two is a misconception I have heard frequently from customers over the past year.

Once the data is consolidated, the next step, which must be committed to from the start, is integration of the data model. Integration will provide analytical consistency, simpler extract, transform and load (ETL) processes, and a shorter time to market. The foundation is now laid for sustainable data warehousing, including going active. Data consolidation is about technical return on investment (ROI); data integration is about business ROI.

You're now at the long haul of your journey, where work needs to be done to set a direction and roadmap for data management, quality and security, and to bring together a consistent set of data to drive the whole organization. Unfortunately, many people are afraid to take this leg of the trip: The drive is long, and you often have to navigate around or through roadblocks. Governance and leadership are critical here to give direction and make people more comfortable with the concept and process. The goal of data integration is one that cannot be missed, and there is no shortcut to this destination.

Just as you do not automatically jump to a destination, you cannot achieve data integration in a single bound. It is a cyclical journey, driven by business need, data availability and ROI. A thoughtfully planned course is necessary to keep short-term benefits in line with long-term objectives.

Once at the integration oasis, you will start to take some day trips. As you get comfortable with the integration of data elements and understand its impact on the cross-functional processes and actions that cut across the company, you will discover what data is missing or which data could be used to complement the existing data warehouse.

Data expansion

"The road to success is always under construction." —Lily Tomlin

Now, one of the tricky parts of the journey is deciding which day trip to take next. Like a family road trip where everyone has a desired attraction to visit, the business and IT communities will certainly have preferences as to which data subject area should be next in line. Of course, not everything can be first. Just like our family road trip, priorities must be established and decisions must be made.

The real question of what data comes next is based on two key factors: cost to acquire the new data elements and their ensuing benefits. Does the data readily exist, and is it in a fairly "clean" state? Will there be many transformation issues to get the current as well as historical data? Does the current system have capacity for this subject area? These are all questions that must be asked from a cost perspective.

From the benefit side of the equation, you want to understand the value of incorporating the new data. Will the new subject area align well with our current data areas? What current analytics will be enhanced by the new data? What new capability will be enabled, and what key performance indicators does it tie back to in our company objectives?

The results of this analysis can help you prioritize the next subject area and phase. The added bonus is that by understanding the benefit part of the equation you can start to set business milestones and measurements into the evolution of the data warehouse implementation.
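
One simple way to turn that cost and benefit analysis into a ranking is sketched below. It is only an illustration: the subject areas, the relative cost and benefit estimates, and the benefit-to-cost ratio used as the score are all assumptions, not a method prescribed here.

```python
from dataclasses import dataclass


@dataclass
class SubjectArea:
    name: str
    acquisition_cost: float   # hypothetical relative estimate (must be > 0)
    expected_benefit: float   # hypothetical relative estimate


def prioritize(areas):
    """Rank candidate subject areas by benefit per unit of cost, best first."""
    for area in areas:
        if area.acquisition_cost <= 0:
            raise ValueError(f"cost must be positive for {area.name}")
    return sorted(
        areas,
        key=lambda a: a.expected_benefit / a.acquisition_cost,
        reverse=True,
    )


# Hypothetical candidates with made-up estimates.
candidates = [
    SubjectArea("inventory", acquisition_cost=40.0, expected_benefit=120.0),
    SubjectArea("web_clicks", acquisition_cost=90.0, expected_benefit=135.0),
    SubjectArea("call_center", acquisition_cost=25.0, expected_benefit=100.0),
]

ranked = [area.name for area in prioritize(candidates)]
```

With these numbers the call center data comes first, even though web clicks has the largest raw benefit, because it is far cheaper to acquire. In practice the estimates themselves are the hard part, and business priorities may override a pure ratio.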

While I have presented this as a linear journey, you will actually end up making a few side trips between integration and expansion in order to extend the data warehouse and its capabilities throughout the organization. One point to make is that, like many families taking road trips, not all business units will have to "travel as one." Some may choose to expand further than others before heading on to our next big stop.

Data acquisition

"It is not so much what you know anymore that counts, it is how fast you learn."
—Robert Kiyosaki

Much like data expansion, improving the frequency and timeliness of data is a day trip. When you look at what new data can be of value, one dimension to consider is the timeliness of that data. Can the benefit increase as the data acquisition becomes more real time? Which data is most beneficial depends on the correlation between the data's timeliness and level of detail on one side, and its cost and value to the organization on the other.

Timeliness is often thought of as having the data in real time, but that is not normally what is required. Timeliness has two aspects: frequency and granularity. The data's business usage and value justify how frequently the same level of data is collected and at what level of granularity.

The question is whether having the same data sooner would enable you to make the same decision sooner and, ultimately, reap a greater or quicker benefit. This depends on your ability to respond to information. If you can take inventory action only once a day, then whether you know about an outage at 8 a.m. or 8 p.m. is irrelevant. However, if you can address inventory requirements throughout the day, then receiving the information earlier makes a great difference.
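
The inventory example can be sketched in a few lines. This is a toy model under stated assumptions: hours are whole numbers on a 24-hour clock, and `action_hours` is a hypothetical list of the hours at which the business is able to act.

```python
def next_action_hour(known_at, action_hours):
    """Return the first hour at or after `known_at` when action can be taken.

    If no action slot remains today, return the first slot tomorrow
    expressed as an hour past 24.
    """
    if not action_hours:
        raise ValueError("at least one action hour is required")
    remaining = [hour for hour in sorted(action_hours) if hour >= known_at]
    if remaining:
        return remaining[0]
    return min(action_hours) + 24


once_a_day = [21]                       # act only at 9 p.m.
through_the_day = list(range(8, 21, 2))  # act every two hours, 8 a.m. to 8 p.m.

# Outage known at 8 a.m. versus 8 p.m.
early_once = next_action_hour(8, once_a_day)
late_once = next_action_hour(20, once_a_day)
early_often = next_action_hour(8, through_the_day)
late_often = next_action_hour(20, through_the_day)
```

When action is possible only once a day, both arrival times lead to action at the same hour, so the earlier data buys nothing. When action is possible throughout the day, knowing at 8 a.m. lets you respond twelve hours sooner.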

Regarding the data's granularity, the level of data you acquire helps to determine the degree of analysis and subsequent decision making. You may load the data only once a day, but rather than a daily aggregation you provide hourly transactions. With more granular time divisions, you can readily spot trends and see whether incorporating changes in your processes and responses makes better business sense.
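
The difference between a daily aggregation and hourly detail can be illustrated as follows. The transactions are invented sample data, and `total_by` is a hypothetical helper written for this sketch.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions loaded once a day: (timestamp, amount).
transactions = [
    (datetime(2007, 6, 1, 9, 15), 20.0),
    (datetime(2007, 6, 1, 9, 40), 35.0),
    (datetime(2007, 6, 1, 13, 5), 50.0),
    (datetime(2007, 6, 1, 17, 30), 80.0),
    (datetime(2007, 6, 1, 17, 45), 65.0),
]


def total_by(transactions, key):
    """Sum amounts grouped by whatever `key` extracts from each timestamp."""
    totals = defaultdict(float)
    for timestamp, amount in transactions:
        totals[key(timestamp)] += amount
    return dict(totals)


daily = total_by(transactions, lambda ts: ts.date())   # one number per day
hourly = total_by(transactions, lambda ts: ts.hour)    # one number per hour
busiest_hour = max(hourly, key=hourly.get)
```

The daily view yields a single total for June 1, while the hourly view shows that more than half of the day's sales arrived in the 5 p.m. hour. Both views come from the same once-a-day load; only the granularity differs.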

Welcome to activation!

Now for a quote of my own: "If you are not taking action, then stop making decisions."

You've arrived at the metropolis of Active Data Warehouse. Having traveled through the integration and timeliness points in our journey, you reach the ultimate outcome of data warehousing: actions are triggered by the data itself without requiring constant intervention. It is only the exceptions that need attention, and the data will alert us to those situations.
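
As a rough sketch of what "actions triggered by the data" can mean, the snippet below scans stock levels and raises an action only for the exceptions. The item names, quantities, reorder points and the wording of the actions are all hypothetical; a real active data warehouse would drive this from its own rules and events.

```python
def find_exceptions(stock_levels, reorder_points):
    """Return, in sorted order, the items whose stock is below their reorder point.

    Items without a reorder point are never flagged.
    """
    return sorted(
        item
        for item, quantity in stock_levels.items()
        if item in reorder_points and quantity < reorder_points[item]
    )


# Hypothetical current stock and reorder thresholds.
stock_levels = {"widgets": 120, "gears": 8, "bolts": 45, "springs": 3}
reorder_points = {"widgets": 50, "gears": 10, "bolts": 40, "springs": 5}

exceptions = find_exceptions(stock_levels, reorder_points)
actions = [f"reorder {item}" for item in exceptions]
```

Only gears and springs produce an action; widgets and bolts need no attention, so nobody has to look at them.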

The road to active is long and shortcuts must not be taken. You can successfully navigate your way by maneuvering through all of the other phases. As you understand what data is meaningful to the processes and the timeliness those processes require, only then can you move forward.

The important message is that data-driven actions will also drive new data points. And remember, this is an ongoing adventure: the Active Data Warehouse metropolis is just a stone's throw from active enterprise intelligence.

Teradata Magazine-June 2007

Copyright © 2008 Teradata Corporation. All rights reserved.