ARCHIVE: Vol. 6, No. 3

IBM+Websphere

Help your business avoid common data management project pitfalls with a Data Integration Center of Excellence.

by Andrew Manby

Whatever the reason—meeting regulatory compliance, centralizing purchasing to achieve lower costs, gross margin reporting—today's businesses need to create and leverage a holistic, reliable, enterprise-wide view of their data. In fact, across industries, data mastery is quickly becoming a competitive necessity.

Many organizations in pursuit of these business initiatives are rationalizing and re-assessing the value and composition of their entire data inventories. Their findings are often quite disturbing. Many discover, for instance, that their enterprises suffer from data fragmentation, duplication, poor data quality and disintegration of their major business entities—such as customers and products—across multiple application silos. This general lack of data governance often indirectly leads to failed projects, missed opportunities, dissatisfied customers and higher operating costs.

Why a COE?
A Data Integration Center of Excellence (COE) can help transform how organizations use and maintain their data inventory. It also provides a repeatable framework of best practices, suitable involvement of IT and business representatives, and holistic use of data integration technology across multiple data-centric projects.

The primary objective of a Data Integration COE is to support a sustainable corporate data governance program, which sets the standards and patterns of data use. This includes ensuring specific rules for security and monitoring compliance across the enterprise. In addition, the Data Integration COE should significantly cut the delivery time and costs of all data integration projects compared to task-specific manual coding.

The Data Integration COE discipline mandates the use of a consistent methodology throughout the project life cycle, involving both business and IT professionals. While the project may use a blended services model involving employees as well as offshore and onshore services, the goal is to create an environment of self-sufficiency. This includes a mandate for staff mentoring and the creation of a best practices knowledge base suitable for reliable reuse across multiple projects.

Asking the right questions
Data Integration COE methodologies help organizations avoid common data management project pitfalls. The following questions provide a continual reference point and should be considered while defining a project's objectives, blueprint, design and technical deployment. Failing to consider them may perpetuate ineffective practices of the past and create a dubious foundation for the entire project.

Completeness
Has the Data Integration COE determined the availability of approved data sources that will become the people (customers, partners), places (territories, regions), and things (products, assemblies) needed to support the application or person using the data?
How is data organized in each/all enterprise applications?
What data is going to best fit the organization's needs?
Is the content consistent with its structure (data type)?
What business entities are necessary for a successful implementation?
Does the key data even exist?
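
As an illustration of the completeness questions above, the following is a minimal sketch of a column profile that reports how often each field is populated and how often its content matches the expected data type. The sample records, the schema and the function names are hypothetical and do not come from any particular data integration product.

```python
from datetime import datetime

def conforms(value, expected_type):
    """Return True if a raw string value can be parsed as expected_type."""
    try:
        if expected_type == "int":
            int(value)
        elif expected_type == "date":
            datetime.strptime(value, "%Y-%m-%d")
        # any other type name (for example "str") accepts the value as-is
        return True
    except ValueError:
        return False

def profile(rows, schema):
    """For each column, report the fill rate and the type-conformance rate."""
    report = {}
    total = len(rows)
    for column, expected_type in schema.items():
        values = [row.get(column, "") for row in rows]
        filled = [v for v in values if v not in ("", None)]
        valid = [v for v in filled if conforms(v, expected_type)]
        report[column] = {
            "fill_rate": len(filled) / total if total else 0.0,
            "type_ok_rate": len(valid) / len(filled) if filled else 0.0,
        }
    return report

# Hypothetical sample: three customer records from one source system
rows = [
    {"customer_id": "101", "name": "Acme", "since": "2004-05-01"},
    {"customer_id": "10x", "name": "", "since": "2005-13-40"},
    {"customer_id": "103", "name": "Globex", "since": ""},
]
schema = {"customer_id": "int", "name": "str", "since": "date"}
report = profile(rows, schema)
```

A low fill rate on a key field suggests that the key data may not exist in that source, while a low type-conformance rate suggests that content and structure disagree.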

Validation
Has the Data Integration COE established a standard for structural and semantic integrity, consistency and relationships between source tables and their relevant business rules?
Would the data be more effective if it were cleansed?
What do the business codes in the lookup tables mean?
What text fields will need mining for key facts?
Does the data represent something the business understands?
Can the data be validated against real business entities; e.g., a customer order?

Accuracy
Has the Data Integration COE constructed a single version of the truth that conforms to the facts about real-world business entities?
Do business entities mirror the real-world collection of people, places and things?
Can the data be validated against external sources?
Is data being duplicated unnecessarily?
Is the data consistent across sources?
What operational systems and integration layers would benefit from the same reference data?
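
To make the duplication and consistency questions above concrete, here is a small sketch that groups records from different sources under a normalized match key. It is a deliberately crude exact-match approach, not the sophisticated matching a data quality tool would apply, and the field names, suffix list and sample records are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Assumed list of company suffixes to ignore when comparing names
IGNORED_SUFFIXES = {"inc", "corp", "ltd", "co"}

def match_key(name, postal_code):
    """Build a crude match key: lowercase, drop punctuation and suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    words = [w for w in cleaned.split() if w not in IGNORED_SUFFIXES]
    return (" ".join(words), postal_code.replace(" ", ""))

def find_duplicates(records):
    """Return groups of records that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record["name"], record["postal_code"])
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical customer records drawn from two application silos
records = [
    {"source": "crm", "id": "C-1", "name": "Acme Corp.", "postal_code": "10001"},
    {"source": "billing", "id": "B-9", "name": "ACME corp", "postal_code": "10001"},
    {"source": "crm", "id": "C-2", "name": "Globex Inc", "postal_code": "60601"},
]
duplicates = find_duplicates(records)
```

In this sample the CRM and billing entries for Acme fall into one group, which flags them as candidates for consolidation into a single reference record.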

Relevance
Is the Data Integration COE able to deliver to the business meaningful information as a packet of business entities in an acceptable timeframe?
Will this information support every new purpose?
Does this information meet external standards—HIPAA, Charter Set of Accounts and business partner requirements?
Is this information timely and current for business processes and purpose?
What constraints on accessibility should be in place?

Agreement and closed loop
Is there an established consensus between business and IT about the use of organizational data? Has the Data Integration COE provided effective data quality thresholds to maintain the integrity of the information?
Is the descriptive information provided about the data meaningful to end-users?
How does the Data Integration COE assess the utilization of data warehouse solutions?
What metrics are attached to each business entity?
What is the business impact of a given level of quality for each business entity?
How does the Data Integration COE monitor, audit and manage events related to data processing?

Laying the foundation
After an organization builds out the people and processes that are necessary for a successful implementation, it must direct its attention to the technical foundation of the Data Integration COE—the data integration suite. The reason for having a suite approach to data integration is two-fold. First, it gives an organization the best possible solution to deal with a broader set of data-centric problems in terms of complexity, volume and variety of data across analytical, transactional and operational systems. Second, the organization works with a single data integration partner that is focused on delivering value on an enterprise-wide scale.

Enterprise data integration vendors have a common heritage and a focus on enabling the business to preserve the semantic integrity of a corporation's data. In order to deliver more value, a data integration suite is defined as a software platform capable of:
Delivering data profiling, data quality and data transformation capabilities on a common parallel architecture with a cross-platform metadata foundation, and service-oriented architecture (SOA) for on-demand usage
Processing complex, deeply nested structures requiring iterative, conditional processing common in EDI and the most sophisticated XML-based industry standards, without programming or multiple passes of the data

As part of the suite, data profiling and auditing tools establish an initial characterization of the data by managing metadata to describe every data element, such as a customer, and its interrelationships with other data, such as products, sales information, and territories. Thereafter, the suite can continually monitor and report the implications of data-level exceptions and enforce precise business rules where necessary. Alternatively, it can alert the appropriate business functions that an error has occurred, requiring intervention by a business analyst. This level of insight is invaluable to the timely delivery and ongoing monitoring of the entire process. If complex data re-engineering is required, organizations must rely on data cleansing and sophisticated matching techniques to ensure data quality.
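
The monitoring behavior described above, enforcing business rules and routing exceptions to an analyst, can be sketched as follows. This is only an illustration under stated assumptions: the rule names, the territory codes and the order records are hypothetical, and a real suite would drive such rules from its metadata rather than from hard-coded functions.

```python
def check_record(record, rules):
    """Return the names of all rules that the record violates."""
    return [name for name, predicate in rules.items() if not predicate(record)]

def process(records, rules):
    """Split records into accepted rows and exceptions needing review."""
    accepted, exceptions = [], []
    for record in records:
        violations = check_record(record, rules)
        if violations:
            # In practice this entry would be sent to a business analyst
            exceptions.append({"record": record, "violations": violations})
        else:
            accepted.append(record)
    return accepted, exceptions

# Hypothetical business rules for an order feed
rules = {
    "quantity_positive": lambda r: r["quantity"] > 0,
    "territory_known": lambda r: r["territory"] in {"EMEA", "APAC", "AMER"},
}
orders = [
    {"order_id": 1, "quantity": 5, "territory": "EMEA"},
    {"order_id": 2, "quantity": 0, "territory": "MARS"},
]
accepted, exceptions = process(orders, rules)
```

Keeping the violated rule names with each exception gives the analyst the reason for the rejection without rerunning the checks.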

In any data integration application, the solution's ability to describe itself to the business user or technical practitioner is vital. A strong foundation of cross-solution metadata management reduces the development and ongoing maintenance complexity that comes from involving data modelers, database administrators, business intelligence (BI) experts, data stewards, data administrators, business users and analysts, and ETL developers, all within the same development process.

Reaching your goals
At the inception of any data integration project, the project team must pay close attention to the people and processes involved as well as the technology. This includes the business stakeholders and stewards who provide guidance and accountability as part of an overall corporate governance mandate. The Data Integration COE is a proven approach to achieving this goal.

Specifically, a Data Integration COE helps to document and validate the business requirements against the actual data inventory and the actual business processes that will use the data. When a solution is correctly implemented using a single, integrated data integration platform, a Data Integration COE will help companies deliver strategic and tactical data to business users on their terms. A Data Integration COE working in concert with business leaders embodies all the necessary knowledge, best practices and methodologies to achieve a repeatable process that saves money and enables mastery of an organization's data.

Andrew Manby is an expert in information management business strategy for IBM Information Integration Solutions.

Teradata Magazine-September 2006

Copyright © 2008 Teradata Corporation. All rights reserved.