By integrating massive amounts of data from diverse sources in ways that are broadly accessible, businesses can profoundly transform their culture when it comes to data accessibility. This can make analysis fast and agile, ensuring that business users and data scientists alike arrive at the answers that matter most.
As information streams in from people, products, and things at a rapid rate in today’s digital environment, a key challenge in data warehousing is capturing, combining, and analyzing so many diverse kinds of data. That’s the objective of the integrated data warehouse (IDW), a centralized store of detailed and summary data that effectively brings together multiple subject and departmental areas to provide a 360-degree view of a functional area within a company.
Two Ways to Integrate Data
When building systems for integrating data, system architects generally adopt one of two philosophies developed by data warehouse design trailblazers William Inmon and Ralph Kimball:
- The Inmon or “top-down” approach identifies the subject areas and entities that the business works with, such as customer, product, and vendor. A detailed logical model is formed for each major entity, and all data is integrated and defined upfront. This model makes loading data less complicated but structuring queries more difficult.
- The Kimball or “bottom-up” approach identifies the key business processes and questions that the data warehouse looks to answer and creates data marts to meet these needs. Data is loaded into a staging area but isn’t as tightly coupled to entities as in the Inmon model.
A key strength of the Inmon approach is that the data warehouse serves as the single source of truth and the place where all data is integrated and standardized. The Kimball approach lacks this centralized standardization, since data is not fully integrated before queries are made. However, the Kimball method allows the data warehouse to be built quickly and applied efficiently to business applications.
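The contrast between the two philosophies can be sketched in a few lines of Python using the standard-library sqlite3 module. This is a simplified illustration, not a reference design: the table names, columns, and sample rows are all hypothetical, and the mart is derived directly from the entity tables for brevity, whereas a real Kimball-style build would typically load dimensional tables from a staging area.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Inmon-style (hypothetical schema): entity tables integrated and defined upfront.
cur.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    product_id  INTEGER REFERENCES product(product_id),
    amount      REAL
);
INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
INSERT INTO product  VALUES (10, 'Widget'), (20, 'Gadget');
INSERT INTO sale VALUES (100, 1, 10, 50.0), (101, 1, 20, 30.0), (102, 2, 10, 20.0);
""")

# Any question can be asked of the integrated model, at the cost of writing joins.
revenue_by_region = cur.execute("""
    SELECT c.region, SUM(s.amount)
    FROM sale s JOIN customer c ON c.customer_id = s.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

revenue_by_product = cur.execute("""
    SELECT p.name, SUM(s.amount)
    FROM sale s JOIN product p ON p.product_id = s.product_id
    GROUP BY p.name ORDER BY p.name
""").fetchall()

# Kimball-style (hypothetical mart): a table shaped around one business question.
cur.execute("""
    CREATE TABLE mart_revenue_by_region AS
    SELECT c.region AS region, SUM(s.amount) AS revenue
    FROM sale s JOIN customer c ON c.customer_id = s.customer_id
    GROUP BY c.region
""")
mart_rows = cur.execute(
    "SELECT region, revenue FROM mart_revenue_by_region ORDER BY region"
).fetchall()

# The mart answers its own question with a trivial query, but it carries no
# product column, so a product question would require redesigning the mart.
mart_columns = [row[1] for row in cur.execute("PRAGMA table_info(mart_revenue_by_region)")]

print(revenue_by_region, revenue_by_product, mart_rows, mart_columns)
```

In this toy setup the mart is quick to build and simple to query, while the integrated model is the only place a new question (revenue by product) can be answered without rework, which mirrors the trade-off described above.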
Teradata advocates the “top-down” approach because it preserves query freedom: users are able to ask any question of any data at any time. Data warehousing by design encourages inspection, drilling down to the next question to understand the root cause. In the Kimball approach, a user might be required to go back and redesign a data mart to answer key questions, ultimately lengthening the time it takes to get to specific answers.
Why an Integrated Data Warehouse (IDW) Matters
By integrating massive amounts of data from diverse sources in ways that are broadly accessible, businesses can:
- Share information across functional units. Disparate sources of data are gathered in one place, reducing the data silos that may exist in an enterprise and ensuring data consistency.
- Provide vital business answers quickly and accurately while incorporating multiple viewpoints. Decision-makers efficiently gain answers to the toughest business questions so they can make the right strategic choices.
- Show a single version of the truth. Everyone in an organization works from the same information to draw conclusions about the state of the business. This level of universal access relieves pressure on IT as more citizen data scientists can add value.
These capabilities can profoundly transform an organization’s culture when it comes to data accessibility. Rather than limiting users’ access to data and stymying innovation, a well-designed IDW can make data available securely and in the right formats for users’ needs. This can make analysis fast and agile, ensuring that business users and data scientists alike arrive at answers that matter most to the business.
Evolution of the Integrated Data Warehouse (IDW)
While the idea of a data warehouse first took shape in the 1960s and 1970s, a groundbreaking moment came in 1988 when Barry Devlin and Paul Murphy wrote about a need for an “integrated warehouse of company data” that could “draw together the various strands of informational system activity within the company.”
For decades, computer scientists have rigorously studied the optimal ways to build this kind of large-scale platform. In 1991, the Integrated Public Use Microdata Series (IPUMS) demonstrated the power of a data warehousing approach by adopting the extract, transform, and load (ETL) method to integrate data from diverse sources into one system.
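As a rough illustration of the ETL pattern itself (not of the IPUMS pipeline, whose details are not described here), the following Python sketch extracts records from two hypothetical sources, transforms them into one shared schema, and loads them into a single table. Every source name, field, and value in it is an assumption made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export with its own column names, amounts in cents.
CRM_CSV = "cust_id,full_name,total_cents\n1,Acme Corp,12500\n2,Globex,8000\n"
# Hypothetical source 2: records from another system, amounts in dollars.
BILLING_ROWS = [{"id": 3, "customer": "initech ", "total_usd": 42.5}]


def extract():
    """Read raw records from each source without changing them."""
    crm_rows = list(csv.DictReader(io.StringIO(CRM_CSV)))
    return crm_rows, BILLING_ROWS


def transform(crm_rows, billing_rows):
    """Map both sources onto one schema: (customer_id, name, total_usd)."""
    unified = []
    for r in crm_rows:
        unified.append((int(r["cust_id"]), r["full_name"].strip(), int(r["total_cents"]) / 100))
    for r in billing_rows:
        unified.append((r["id"], r["customer"].strip().title(), float(r["total_usd"])))
    return unified


def load(rows, conn):
    """Write the unified rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, total_usd REAL)"
    )
    conn.executemany("INSERT INTO customer_totals VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(*extract()), conn)
loaded = conn.execute(
    "SELECT customer_id, name, total_usd FROM customer_totals ORDER BY customer_id"
).fetchall()
print(loaded)
```

The transform step is where integration happens: differing keys, units, and formatting are reconciled before anything reaches the target table, so every downstream query sees one consistent shape.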
As data exploded with the expansion of the Internet, emergence of smartphones, and rise of cloud computing, ETL became less feasible for data sets that had to be constantly updated. Data hub and data lake approaches have since emerged, pooling unstructured data together without requiring tightly coupled relational data processes.
For more than four decades, Teradata has been at the forefront of IDW design and development. And today, Teradata continues to innovate in the field as we apply best practices to Teradata Vantage.
Teradata Vantage is the leading hybrid cloud data analytics software that leverages 100% of your data to analyze anything, anywhere, at any time. Combining the power and ingenuity of the IDW with the flexibility and scalability of the cloud, Teradata Vantage is built and priced for industry-leading performance at scale. It’s simple to use and integrate with your current systems, and it provides flexibility and control no matter your needs or what evolving technologies are available.