
It's all in the process

Implementing a master data management solution framework.

by Mark Shainman

Many organizations understand the value of corporate data and the importance of managing it as an asset. Companies have sought to implement solutions that enable them to leverage this asset more efficiently and effectively for analytical and operational corporate processes. Good corporate data quality, meaning data that is consistent and accurate, depends directly on the proper management of master data: a company's reference data that is shared across operational and analytic systems and used to classify and define transactional data. This includes customer and supplier lists, charts of accounts, bills of material, and organization, product and customer hierarchies.

Companies face a difficult challenge in keeping master data consistent, complete and controlled across the enterprise—especially when disparate, decentralized operational and analytical systems define and handle master data in different ways. Inconsistent and inaccurate master data can cause numerous problems linked to analytical and operational processes, such as:
Long or poorly executed new product introduction cycle times. Product master data and workflows that are inadequately modeled/synchronized across the enterprise can result in reactive, inconsistent and unpredictable new product introduction programs.
Difficulty determining customer profitability. Multiple unsynchronized, poor-quality copies of customer master data, spread across channels, locations and/or the enterprise, can cause redundant customer communications and missed opportunities.
Vague understanding of supplier value. Inconsistent and unsynchronized vendor master data, product master data and bills of material across an organization's departments and divisions inhibit the company's ability to get a holistic view of strategic suppliers. This can lead to inefficient consolidation of spend and reduced leverage with suppliers on pricing and terms.
Inhibited compliance with regulatory requirements. Clean and consistent codes and hierarchies are essential to meeting regulatory compliance and audit requirements. But this consistency is not guaranteed if master data is held in multiple operational systems and/or used for multiple consolidation activities. For better compliance results, data should be centrally located and managed.
Limited ability to create a true integrated enterprise data warehouse (EDW). Consolidating data marts leads to co-location, not integration, of data. Master data inconsistencies affect the overall analytical data quality.

Figure 1: Master data management process framework
The master data management (MDM) solution framework provides a map of each stage of the MDM initiative. Each stage requires a focus on different disciplines of data management to reach a managed enterprise data asset business capability.

For years, traditional data warehousing environments have attempted to rectify disparities in master data by leveraging extract, transform and load (ETL) tools, along with custom code and third-party data-quality tools. These solutions, however, are fragmented, inflexible and unable to holistically manage master data. ETL-centric or homegrown attempts also force the responsibility on IT for maintaining and updating the data. Instead, these tasks belong with the domain experts on the business side.

New technologies, such as Teradata Master Data Management (MDM), can help solve these master data problems; however, the process is often more important than the underlying technology. Without the correct process in place, the greatest technology is simply a large capital expenditure with little to no benefit.

When implementing an MDM solution, organizations must consider a solution framework that follows, in many regards, a methodology similar to creating an EDW. (See figure 1, above.) An MDM solution can and should run parallel with an EDW implementation. In fact, to achieve truly integrated data within the warehouse, the MDM solution must be part of an EDW effort and implementation.

Selecting the proper MDM solution requires the combined effort of various members in your organization. It is crucial to put together a team of data stewards and set up the cross-domain data governance rules that are leveraged in MDM. This data governance process must completely overlay the MDM solution framework (see figure 2, below) so that the correct process and business rules are deliberately defined for each phase. This is an important part of establishing the groundwork for a successful MDM initiative, because even with a great underlying technology, governance and process are key.

The governance team must follow four phases to best implement an MDM solution framework:

Figure 2: Key components of governance
Each level in the data governance pyramid has different and important functions. These functions foster better performance in the organization when complying with master data management requirements.

Phase 1: Define and profile
A critical component of an MDM initiative is assigning management responsibility for the master data. During this first phase, IT must work with business to explicitly assign data stewardship responsibility to individuals and departments in the organization.

As mentioned earlier, it is crucial for a successful MDM process that this role is shared between IT and the data domain experts on the business side. Since these experts understand the data, they should be tasked with the stewardship of that data.

Once the stewardship roles are established, the master data must be identified, categorized and prioritized. In most cases, the highest priorities are the broad subject areas that have the greatest value and are most widely leveraged by multiple systems and users. While it may be tempting to prioritize the easiest categories first, the easiest does not always carry the most value in the company, and continued sponsorship and funding can be difficult to secure when little value is seen. Instead, priorities should be defined based on value as weighed against risk. An MDM initiative can start small with a single high-value domain, such as customer or product, but the eventual goal and end-state should be a multi-domain, holistic MDM initiative.

To determine what data to manage through MDM, first identify the relevant objects and data elements. Because not all master data is equally relevant, you must consider the following criteria:
Is the master data valuable to the data's consumers?
Is the data currently shared throughout the organization?
Is it possible the data could be shared?

Data that fits within any of these criteria should be considered relevant and, therefore, qualified for data management.
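The three relevance questions above can be read as a simple "any of" test. The following is a minimal sketch under that reading; the class name, field names and sample candidates are illustrative assumptions, not part of any MDM product.

```python
from dataclasses import dataclass


@dataclass
class CandidateData:
    """One candidate data set, scored against the three relevance questions."""
    name: str
    valuable_to_consumers: bool   # Is it valuable to the data's consumers?
    currently_shared: bool        # Is it shared throughout the organization today?
    could_be_shared: bool         # Is it possible the data could be shared?


def is_relevant(candidate: CandidateData) -> bool:
    # Meeting any one criterion qualifies the data for management.
    return (candidate.valuable_to_consumers
            or candidate.currently_shared
            or candidate.could_be_shared)


# Hypothetical candidates, for illustration only.
candidates = [
    CandidateData("customer list", True, True, True),
    CandidateData("local printer codes", False, False, False),
    CandidateData("supplier list", True, False, True),
]

relevant = [c.name for c in candidates if is_relevant(c)]
print(relevant)
```

In practice the answers would come from the data stewards rather than from hard-coded flags.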

Criteria for evaluating data elements within a subject area:
> How many data elements exist?
> What is the overall lifetime capacity of the data elements?
> Can the data element be reused across shared boundaries within the organization?
> What is its value to the company?
> How complex is each data element?
> How volatile is the data element; how often does it change?
> Which entities should be shared?
> Can the entities be categorized in terms of behavior and attributes within the context of the business needs?

—M.S.

Next, the data subject areas must be defined. While the subject areas should not be limited to specific domains, they often can be based on categories such as customer or product. Best practice would entail defining an overall enterprise model and process for the master data subject areas, even if all the data from all the domains is not initially addressed. The master data sources must also be understood and mapped.

During this initial phase, the governance team specifies the policies and business rules regarding how the master data is created and maintained. The master data can be created within one or more existing operational systems; each such system, the place where the master data is created, is called a system of record. In an MDM environment, the master data can also be created directly in the MDM application, but most environments will use a combination of both. When master data is maintained in the MDM application, the MDM solution also acts as a system of record for some of the master data.

This is the phase in which to describe any hierarchies, taxonomies or other relationships that are important to organizing and classifying the master data objects that are to be managed and maintained.

Phase 2: Acquire and enhance
After the data has been classified and the data rules established, the rules are applied. The data integration process occurs in the staging area through workflow processes and administration of the business rules defined earlier. This is the stage in which the governance team examines how the data is extracted from the source systems and staged. Then data-quality functions are performed to clean, rationalize and cross-reference the data.
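As a rough illustration of the clean, rationalize and cross-reference step, the sketch below normalizes customer names from two staged sources and groups matching records under one master key. The matching rule (exact match on a normalized name), the suffix list, the key format and the sample records are all assumptions made for the example; a real MDM application would apply the business rules defined by the governance team.

```python
import re
from collections import defaultdict

# Hypothetical legal suffixes to ignore when comparing company names.
IGNORED_SUFFIXES = {"inc", "corp", "ltd", "llc"}


def normalize_name(raw: str) -> str:
    """Clean a name: lowercase, strip punctuation, drop legal suffixes."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [t for t in text.split() if t not in IGNORED_SUFFIXES]
    return " ".join(tokens)


def build_cross_reference(records):
    """Group staged records by normalized name and assign a master key."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_name(rec["name"])].append((rec["source"], rec["source_id"]))
    xref = {}
    for i, (match_key, members) in enumerate(sorted(groups.items()), start=1):
        xref[f"M{i:04d}"] = {"match_key": match_key, "members": members}
    return xref


# Hypothetical staged extracts from two source systems.
staged = [
    {"source": "crm", "source_id": "C-100", "name": "Acme Corp."},
    {"source": "billing", "source_id": "9001", "name": "ACME corp"},
    {"source": "crm", "source_id": "C-101", "name": "Globex, Inc."},
]

xref = build_cross_reference(staged)
for master_id, entry in xref.items():
    print(master_id, entry["match_key"], entry["members"])
```

Exact matching on a cleaned key is the simplest possible rule; it is used here only to show where cross-referencing sits in the flow.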

The extraction process can be completed through multiple methods, but bulk data movement methods (e.g., ETL, flat file) are usually best for the initial staging phase. Based on the organization's business needs, other methods, such as real-time extraction through an enterprise message bus or replication, can be leveraged to support the continued update and maintenance required in Phase 3. When considering how to achieve the initial data acquisition process, the following questions should be addressed:
Can the support staff use the tools they are familiar with to provide extracts?
What times are the most favorable for the extract procedure?
How will the data be transported from the source to the target repository?
What interface options does the source system provide?
Is an infrastructure in place that can be easily leveraged to extract the data from the source systems?

Mapping information about the master data during the acquisition and staging process
> Source system. What to collect
> Subject area name. Facet in the logical data model (LDM)
> MDM table name. Table in the industry-specific LDM
> MDM column name. Column in the industry-specific LDM
> Source. The source system name from which the column will be mapped
> Table/file. The name of the table or file containing the operational data
> Column. The column name of the source system
> Rule. Based on the transformation need
> Comment. Any comment required to clarify the rule or uncertainty
> Owner. The person who verified the rule
> Action. Any action required for verification of the rule

—M.S.
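The mapping items in the sidebar above could be captured as one record per mapped column. This is a minimal sketch: the field names follow the sidebar, while the class name, the sample values and the "pending verification" rule are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    """One source-to-MDM column mapping, with fields taken from the sidebar."""
    subject_area_name: str   # facet in the logical data model (LDM)
    mdm_table_name: str      # table in the industry-specific LDM
    mdm_column_name: str     # column in the industry-specific LDM
    source: str              # source system the column is mapped from
    table_or_file: str       # table or file holding the operational data
    column: str              # column name in the source system
    rule: str                # transformation rule
    comment: str = ""        # clarification of the rule or uncertainty
    owner: str = ""          # person who verified the rule
    action: str = ""         # action still required to verify the rule


def pending_verification(rows):
    """Assumed rule: a mapping is open if it has an action or no owner."""
    return [m for m in rows if m.action or not m.owner]


# Hypothetical mappings, for illustration only.
mappings = [
    ColumnMapping("Customer", "CUSTOMER", "CUSTOMER_NAME", "crm",
                  "cust_master", "cust_nm", "trim and upper-case",
                  owner="J. Doe"),
    ColumnMapping("Product", "PRODUCT", "PRODUCT_CODE", "erp",
                  "items.csv", "item_cd", "left-pad to 8 digits",
                  comment="padding character unconfirmed",
                  action="confirm padding rule with ERP owner"),
]

open_items = [m.mdm_column_name for m in pending_verification(mappings)]
print(open_items)
```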

The core component of an MDM application is its ability to set up and manage the business rules and workflows that cross-reference, manage and enhance your organization's master data. During this initial data acquisition phase you can use the data-quality components of the MDM application to create the data baseline. You can also leverage the data profiling tools to gain some insight into the baseline data. This information can later be used to help design the data maintenance process.

Phase 3: Manage
When creating the system of reference where the master repository exists, an initial one-time bulk extraction from the Phase 2 staging area to the new repository might be used, but in most cases the master repository is simply an evolution of that staging area.

During Phase 3, a single master repository for master data reference or system of reference is established. System of reference refers to where the "golden copy" of the master data is maintained for reference or synchronization purposes. The repository contains the master reference data (such as customer name, product name and bill of material) and master relationship data (for example, customer and product hierarchies and product-to-supplier relationships).

The organization defines and implements its continual master data maintenance workflows during this third phase. Accordingly, the user interfaces (UIs) leveraged by the data stewards are implemented as part of the master data maintenance workflows. The ongoing maintenance by the data stewards and improvement in data quality should always occur in this phase. Though considered in Phase 2, the update frequency required for the ongoing maintenance of the master data is determined here. As data demands adjust based on changing business needs, the update frequency can change as well. In this environment you can track and maintain changes in master data over time, which enables you to monitor and analyze data for trends.
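One common way to track changes in master data over time is to keep effective-dated versions of each attribute instead of overwriting values. The sketch below assumes that approach; the class names, the customer-segment example and the requirement that changes arrive in date order are assumptions, not a description of how any particular MDM product stores history.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Version:
    value: str
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


class AttributeHistory:
    """Effective-dated history of one master data attribute.

    Assumes changes are recorded in non-decreasing date order.
    """

    def __init__(self):
        self._versions = []

    def set(self, value: str, effective: date) -> None:
        if self._versions:
            current = self._versions[-1]
            if current.value == value:
                return                     # no change, nothing to record
            current.valid_to = effective   # close the previous version
        self._versions.append(Version(value, effective))

    def as_of(self, day: date) -> Optional[str]:
        """Return the value that was in effect on the given day, if any."""
        for v in self._versions:
            if v.valid_from <= day and (v.valid_to is None or day < v.valid_to):
                return v.value
        return None


# Hypothetical example: a customer's segment changes over time.
segment = AttributeHistory()
segment.set("Silver", date(2008, 1, 1))
segment.set("Gold", date(2008, 3, 1))

print(segment.as_of(date(2008, 2, 15)))
print(segment.as_of(date(2008, 3, 1)))
```

Keeping closed versions rather than deleting them is what makes later trend analysis possible.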

When establishing your data maintenance workflows, consider these questions:
Does your organization have a best-practice life-cycle workflow for master data assets?
How do changes get reflected in the master data repository?
How do you detect and fix master data errors?
Where is a master data element created or deleted?
Do you have business processes defined for data entry and updates?
Are these processes automated in operations?
Do you have an enterprise business process workflow for each line of business involved in the framework?

You must also ask some critical infrastructure questions in the overall support and maintenance of your MDM solution, such as:
What type of system availability does the solution require?
What type of service level agreements must the solution support?
Is there a need for data to be published?

The answers to these questions can help determine the actual infrastructure needed to support your organization's solution.

Phase 4: Use
The clean and accurate master data is used in this fourth phase. You must examine how the data is going to be used and how frequently it will be accessed. In the case of an EDW implementation, consider how the EDW will consume the master data within the analytical environment. Since the MDM repository and the EDW exist on the same platform, the clean and accurate master data, cross-reference tables and hierarchies can be easily published to the data warehouse using numerous methods, ranging from ETL tools to the INSERT SELECT SQL statement.
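The INSERT SELECT approach can be sketched as follows. An in-memory SQLite database stands in for the shared platform purely so the example runs anywhere; it is not Teradata syntax or configuration, and the table names, columns and the `status` filter are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite is a stand-in for a platform hosting both MDM and EDW tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mdm_customer (
        master_id     TEXT PRIMARY KEY,
        customer_name TEXT,
        status        TEXT
    );
    CREATE TABLE edw_customer_dim (
        master_id     TEXT PRIMARY KEY,
        customer_name TEXT
    );
    INSERT INTO mdm_customer VALUES
        ('M0001', 'Acme',    'approved'),
        ('M0002', 'Globex',  'approved'),
        ('M0003', 'Initech', 'pending');
""")

# Publish only approved master records to the warehouse table in one statement.
conn.execute("""
    INSERT INTO edw_customer_dim (master_id, customer_name)
    SELECT master_id, customer_name
    FROM mdm_customer
    WHERE status = 'approved'
""")
conn.commit()

published = conn.execute(
    "SELECT master_id, customer_name FROM edw_customer_dim ORDER BY master_id"
).fetchall()
print(published)
```

Because both tables live in the same database, the data moves without leaving the platform, which is the point the paragraph above makes about co-location.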

Figure 3: Flexibility of the physical data model
Teradata allows flexibility in the physical model and how the data is used. Initially, master data management tables can be dedicated tables in their own database, but they can also be views into tables in the larger enterprise data warehouse.

Duplication of the master data within the master data tables to the warehouse is not always necessary. Based on how the master data record is updated and the update frequency, you can simply create a semantic view on top of the MDM table to access for analytical purposes without duplicating the data. (See figure 3.) This is just one benefit of the MDM repository co-existing on the same platform as the EDW.
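A view over the MDM table, rather than a copy, might look like the sketch below. SQLite is again used only as a runnable stand-in for the shared platform, and the table, view and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mdm_product (
        master_id    TEXT PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    INSERT INTO mdm_product VALUES ('P0001', 'Widget', 'Hardware');

    -- Analytical users query the view; no rows are copied.
    CREATE VIEW edw_product_v AS
        SELECT master_id, product_name, category
        FROM mdm_product;
""")

query = "SELECT category FROM edw_product_v WHERE master_id = 'P0001'"
before = conn.execute(query).fetchone()[0]

# A steward corrects the master record; the view reflects it immediately.
conn.execute("UPDATE mdm_product SET category = 'Components' WHERE master_id = 'P0001'")
after = conn.execute(query).fetchone()[0]

print(before, "->", after)
```

The trade-off is that every analytical query reads the live master table, so this fits best when the update pattern and frequency make that acceptable.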

Just as you must examine how the EDW will use the master data, for a broader MDM initiative you must also understand how it will be leveraged by other systems. If any operational system's local data repositories need to subscribe to the enterprise MDM solution, you must determine how to update those master data repositories. You will need to verify that the existing infrastructure, such as an enterprise message bus or replication technology, can support the update. Also note the types of adaptors or environmental requirements that exist for updating those applications.
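The subscription pattern described above can be illustrated with a toy in-process publish-and-subscribe bus. This is only a sketch of the pattern: a real deployment would use the organization's message bus or replication technology, and `MasterDataBus`, the domain names and the sample record are hypothetical.

```python
from collections import defaultdict


class MasterDataBus:
    """Toy stand-in for an enterprise message bus, keyed by master data domain."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, domain, handler):
        """Register a callable that receives each published record for a domain."""
        self._subscribers[domain].append(handler)

    def publish(self, domain, record):
        """Deliver a record to every subscriber; return how many received it."""
        handlers = self._subscribers[domain]
        for handler in handlers:
            handler(record)
        return len(handlers)


# A hypothetical operational system keeps a local copy of customer master data.
order_entry_cache = {}


def update_order_entry(record):
    order_entry_cache[record["master_id"]] = record


bus = MasterDataBus()
bus.subscribe("customer", update_order_entry)

delivered = bus.publish("customer", {"master_id": "M0001", "customer_name": "Acme"})
print(delivered, order_entry_cache)
```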

With Teradata MDM, the MDM processes can be exposed as Web services, enabling the application to be used not only through a publish-and-subscribe method but also for direct access to the master data. This allows new applications coming online to, in some cases, forgo a local master data repository and simply leverage a Teradata MDM service for their master data needs.

MDM benefits
By leveraging a powerful MDM process in conjunction with a robust MDM technology, organizations can solve their problem of inconsistent and inaccurate master data. They can improve their EDW environment, increasing the overall quality of the data in the warehouse and the accuracy of master data throughout the enterprise.

MDM is the right solution for companies that wish to lower costs, increase their technological and informational scale, improve business agility and strengthen enterprise decision making.

Mark Shainman is the global program manager of the Teradata MDM solution. He also manages the Teradata Oracle Migration Program and works on the company's strategy and market analysis teams.

Teradata Magazine-March 2008

Copyright © 2008 Teradata Corporation. All rights reserved.