What is Data Gravity?
Data gravity arises as the volume of data in a repository grows and the number of uses for that data grows with it. At some point, copying or migrating the data becomes onerous and expensive. The data therefore tends to pull services, applications, and other data toward its repository. Data warehouses and data lakes are primary examples: the data in these systems has inertia. Growing data volumes often break existing infrastructure and processes, and the remedies are risky and expensive. The best-practice design is therefore to move processing to the data, not the other way around.
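To give a rough sense of why copying becomes onerous as volume grows, the sketch below estimates the idealized time to move a data set across a network link. It is illustrative arithmetic only: the function name `transfer_seconds`, the 1 Gbit/s link speed, and the `efficiency` parameter are assumptions made for this example, and protocol overhead, contention, and storage read/write speed are ignored.

```python
def transfer_seconds(data_bytes: float,
                     link_bits_per_second: float,
                     efficiency: float = 1.0) -> float:
    """Idealized time, in seconds, to copy data_bytes over a network link.

    efficiency is the assumed fraction of the nominal link speed that is
    actually usable (1.0 means a perfect, uncontended link).
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    # 8 bits per byte, divided by the usable bits per second.
    return data_bytes * 8 / (link_bits_per_second * efficiency)


# Decimal (SI) units, assumed for this illustration.
TB = 10**12      # bytes in one terabyte
PB = 10**15      # bytes in one petabyte
GBIT = 10**9     # bits per second on a hypothetical 1 Gbit/s link

for label, size in (("1 TB", TB), ("1 PB", PB)):
    hours = transfer_seconds(size, GBIT) / 3600
    print(f"{label}: {hours:,.1f} hours")
```

Under these assumptions, one terabyte takes about 2.2 hours to copy, while one petabyte takes about 2,222 hours, or roughly 93 days. A query or job shipped to where the data already resides moves only its code and its results, which is the intuition behind moving processing to the data.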
Data gravity has affected terabyte- and petabyte-scale data warehouses for many years. It is one reason scalable parallel processing of big data is required. The principle now extends to data lakes, which serve different use cases. Teradata helps clients manage data gravity.