Register | Log in


Subscribe Now>>
Home Tech2Tech Features Viewpoints Facts & Fun Teradata.com
Why Teradata
Download PDF|Send to Colleague

Mold the right approach

Dimensional, normalized or hybrid data models show different levels of relationships.

by Dan Higgins, Director of data warehouse sales support for Teradata

In data warehousing, one of the most contested issues regarding the best approach for problem solving is data modeling. For years, proponents for the dimensional, or star schema, have argued for their methods to modeling warehouse data. Others swear the normalized model is more revealing. However, as is true with most opposing views, both have merits and challenges.

Mold the right approach

Dimensional and normalized modeling approaches have a place in data warehousing and other decision support environments. In fact, some Teradata customers use normalized modeling exclusively; some use only dimensional modeling; many use a combination of the two; and some develop hybrid data structures.

It's clear that organizations with successful data warehouse systems are open to redesigning and redeveloping their models based on their needs. A company's data architects and modelers do not choose a data modeling approach simply because it is what they have always done, or because of architectural dogmatism, or because of the limitations of a particular technology. They choose a particular approach because it's right for their system, it's adaptable, it's scalable—and it meets the business's requirements.

To get to that point ourselves, we need to understand the roles of the different data modeling approaches and exploit the most right method, or blending of methods, for present and future needs.

Better than normal
Some users in the IT community assert that a normalized data model is appropriate only for transaction processing systems. Others say this model is also useful for data integration hubs, which feed the downstream, dimensionally modeled data marts that support the needs of the end users. And some refer to dimensional modeling as data warehousing and normalized modeling as online transaction processing (OLTP).

Together with hundreds of its customers, Teradata recognizes that the normalized approach is useful in enterprise data warehousing for modeling business data and enabling end users and applications to have direct access to the normalized data model. The benefits of normalized data modeling include:
Superior cross-functional, cross-subject-area data integration
Easier data maintenance and reduced data redundancy
Richer and more extensive set of relationships among data entities
Wider range of support for applications and varying needs of different business functions

Another dimension
The dimensional, or star schema, data model may be more appropriate for specific decision support applications. While a dimensional approach often eliminates many of the valuable relationships that exist among data entities, it also removes much of the complexity inherent in a normalized model. This simpler representation of the data can enhance end users' comprehension and increase their utilization of the system.

With an underlying normalized data model, a virtual or physical semantic layer based on dimensional models is useful for hiding the complexity of a normalized model and supporting a wide range of basic reporting and query applications. For this reason, many online analytical processing (OLAP) business intelligence (BI) tools assume a dimensional cube structure.

Stay flexible
Many Teradata customers ask: When building an enterprise data warehouse (EDW) that will eventually support a wide range of applications and end user needs, which approach should be used, normalized or dimensional? To provide answers based on direct subject-to-subject relationships as well as the relationships among several entities, companies can leverage both tactics.

A normalized data model should be used for the underlying data that is integrated and shared across the enterprise. The outcome will serve as the raw material for application-specific database views and table structures. (See figure right.)

For example, a marketing department that is planning a product promotion might use a normalized approach. Among other things, the department would need to know the demographics of its intended consumer audience, the best salable time for that product and the most effective means to communicate its advertising message.

Dimensional models should be used for OLAP tools, for some reporting applications and when a simpler, more straightforward presentation of the data is required for end users who need to understand the relationships between two sets of data. Logical database views and features such as aggregate join indexes will help minimize the maintenance of dimensional structures while balancing system performance.

The dimensional model would be the appropriate tool, for instance, when the finance department simply needs to identify the number of a particular account sold.

Incorporating the complementary methods of normalized and dimensional data modeling into the EDW processes will deliver the greatest business value to any organization. T

The value of relationships

The most valuable insight in decision support solutions comes not from data about "things" but from information about the relationships among those things. For example, how customers and products relate to one another is more valuable than information only about customers or only about products. The information about things can tell you what happened. The information on the relationships among things can help you understand how and why it happened and is more useful in predicting what will happen.

Normalized data models can accommodate far more relationships among data entities than is possible in a dimensional model. This relationship capability allows greater flexibility when modeling data and, more important, allows the model to answer a much broader range of business questions. A normalized data model also eases the data transition from operational systems into the data warehouse and provides greater opportunity for reusing data in new applications, as well as faster deployment of those applications.

Consider the dimensional (or star schema) and normalized data model diagrams in the figure. In both diagrams, the lines represent relationships between data entities. The more basic dimensional model is effective for some types of hierarchical reporting and can answer business questions that fit a specific star or cube. The normalized model includes a far richer set of data relationships and can answer questions about the intertwining relationships.

—D.H.

Why Teradata for all types of data modeling

Teradata has long been a proponent of normalized data modeling, in part because of its suitability for enterprise data warehousing applications but also because the Teradata platform is superior for managing and exploiting the normalized data model. But Teradata is not exclusive to the normalized approach; rather, the design of the Teradata platform makes it an excellent choice for the full range of data modeling methods, including applications using a dimensional, or star schema, data model.

These features make Teradata an adaptable system for any type of modeling:
> Database architecture is not biased toward any particular data modeling method.
> System performance and scalability come from its parallelism and software-based shared-nothing architecture.
> Query rewrite capabilities can comfortably handle complex joins associated with a normalized data model.
> Database views can be used to define a virtual dimensional model, thereby exposing only the appropriate data multi-dimensional elements and relationships to a particular business intelligence (BI) user or tool.
> Virtual dimensional models can camouflage complexity sometimes associated with normalized data models.
> Referential integrity, case logic, table elimination and optimization strategies enable the virtual dimensional model to become reality/physicalized.
> Aggregate join indexes (AJIs) provide easy and automatic maintenance of cube structures.
> AJIs eliminate costs associated with building cubes, such as large-server maintenance, large data transfers, data latency and limited dimensionality for data analysis.
> Continuous improvements made to industry-specific normalized logical data models (LDMs) minimize years of customized design work.
> Expansion of a comprehensive set of LDMs will include multi-dimensional data models.

—D.H., with contributions from Carlos Bouloy

Photography by Getty images

Teradata Magazine-June 2008

Related Links

Reference Library

Get complete access to Teradata articles and white papers specific to your area of interest by selecting a category below. Reference Library
Search our library:


Manthan

Trillium

Protegrity

Teradata.com | About Us | Contact Us | Media Kit | Subscribe | Privacy/Legal | RSS
Copyright © 2008 Teradata Corporation. All rights reserved.