Data gravity: large data sets attract smaller data sets

The effect of data gravity and the cloud

Data gravity highlights the importance of data in the modern economy. Organizations are facing rapid growth in enterprise data, driven by data-intensive applications such as the Internet of Things (IoT) and Artificial Intelligence (AI). Whatever the industry, the volume of data that companies hold keeps growing, and that data is becoming increasingly strategic.

What is data gravity?

The term “data gravity” refers to the ability of data to attract additional data, applications and services as data sets grow. It is also related to how quickly the data grows.

Dave McCrory, VP of Growth, Global Head of Insights and Analytics at Digital Realty, coined the term “data gravity” to describe how the attraction between data sets is directly proportional to their “weight” or size, much as the law of gravity relates the attraction between physical objects to their mass. According to this concept, large data sets tend to attract smaller data sets, as well as the IT systems, applications and services related to that data.

The larger the data set, the greater its gravitational pull and the more applications, services and data it is likely to attract.

The phenomenon of data gravity highlights the importance of analyzing how many applications, services and additional data sets a data set might attract, in order to avoid latency and portability problems in the future. By taking the data’s potential growth into account, businesses can design solutions that use that data efficiently.

How to face the negative effects of data gravity

Latency, portability, costs, security and regulatory compliance are some of the issues data managers need to address because of the data gravity effect. Choosing suitable cloud and connectivity solutions is critical for leveraging the benefits of data gravity.

Overcoming these challenges is essential to leverage Big Data for better data analysis and for Artificial Intelligence and Machine Learning applications.


To avoid the negative impact of data gravity, companies should keep data close to where it is created, processed and exchanged. Keeping applications and systems close to the data is especially important to avoid latency issues when working with large data volumes.
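
As a rough illustration of why proximity matters, the sketch below estimates how long an application waits on the network when it issues requests one after another against a data set. The function name, the request count and the round-trip times are all hypothetical values chosen for the example, not measurements.

```python
def network_wait_seconds(sequential_requests: int, round_trip_ms: float) -> float:
    """Time spent waiting on the network when requests run one after another."""
    if sequential_requests < 0 or round_trip_ms < 0:
        raise ValueError("inputs must be non-negative")
    return sequential_requests * round_trip_ms / 1000.0


# Hypothetical workload: 50,000 sequential queries against the data set.
remote = network_wait_seconds(50_000, 40.0)  # assumed 40 ms round trip (distant region)
local = network_wait_seconds(50_000, 0.5)    # assumed 0.5 ms round trip (next to the data)
print(f"remote: {remote:.0f} s, local: {local:.0f} s")
```

Under these assumed figures the same workload waits 2000 seconds when the application is far from the data and 25 seconds when it sits next to it. Real applications can hide part of this with batching or parallel requests, so treat the result as an upper bound for a strictly sequential access pattern.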


In addition, to guarantee scalability and data consolidation, it is advisable to design the company’s IT architecture around a scale-out NAS platform, such as Stackscale’s Network Storage based on NetApp AFF.

A scalable storage solution is essential to avoid performance issues as data volume grows. It is equally important to ensure that storage costs do not skyrocket as capacity increases. In other words, a fivefold increase in data storage capacity should not entail a fivefold increase in costs.
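
To make the cost point concrete, here is a minimal sketch that compares the growth in capacity with the growth in cost under a tiered pricing model. The tiers and prices are invented for illustration only and do not reflect any vendor's price list; `monthly_cost` and `cost_growth_factor` are hypothetical helper names.

```python
# Hypothetical volume tiers: (upper bound in TB, price per TB per month).
# These figures are invented for illustration, not a real price list.
TIERS = [(10, 30.0), (50, 22.0), (float("inf"), 15.0)]


def monthly_cost(capacity_tb: float) -> float:
    """Bill each slice of capacity at the rate of the tier it falls into."""
    cost, lower = 0.0, 0.0
    for upper, price_per_tb in TIERS:
        if capacity_tb <= lower:
            break
        billable_tb = min(capacity_tb, upper) - lower
        cost += billable_tb * price_per_tb
        lower = upper
    return cost


def cost_growth_factor(old_tb: float, new_tb: float) -> float:
    """How many times more expensive new_tb is compared with old_tb."""
    old_cost = monthly_cost(old_tb)
    if old_cost <= 0:
        raise ValueError("old_tb must be greater than zero")
    return monthly_cost(new_tb) / old_cost


# Capacity grows 5x (10 TB -> 50 TB); the cost factor stays below 5.
print(round(cost_growth_factor(10, 50), 2))
```

With these assumed tiers, a 5x capacity increase costs roughly 3.9 times as much. Running the same comparison against a real quote is a quick way to check whether a storage offer scales sub-linearly.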


Companies should also design migration plans around their data’s potential growth, not the data set’s current size. Data sets grow continuously, so migrating to a different vendor can become costly and difficult in the future.
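
A simple way to plan around future size is to project the data set forward and estimate how long a bulk migration would take at that point. The sketch below assumes compound annual growth and a link that sustains a fixed fraction of its nominal bandwidth; the growth rate, link speed and efficiency are hypothetical inputs to replace with your own figures.

```python
def projected_size_tb(current_tb: float, annual_growth_rate: float, years: int) -> float:
    """Project the data set size assuming compound annual growth."""
    return current_tb * (1 + annual_growth_rate) ** years


def bulk_transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Hours needed to copy size_tb (decimal terabytes) over a link_gbps link.

    efficiency is the assumed fraction of nominal bandwidth actually sustained.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = size_tb * 8e12  # 1 TB = 10^12 bytes = 8 * 10^12 bits
    return bits / (link_gbps * 1e9 * efficiency) / 3600


# Hypothetical case: 100 TB today, growing 40% per year, migrated in 3 years
# over a 10 Gbps link.
future_tb = projected_size_tb(100, 0.40, 3)
print(f"{future_tb:.1f} TB, {bulk_transfer_hours(future_tb, 10):.1f} h")
```

In this assumed scenario the data set reaches about 274 TB, and copying it takes roughly 76 hours instead of the 28 hours today's 100 TB would need. The estimate covers raw transfer only and ignores validation, cutover and egress fees.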

Even if everything works well at the start of a vendor relationship, businesses might need to find more suitable alternatives over time. Considering data portability from the beginning is therefore important to avoid vendor lock-in.

Some organizations opt for a multicloud strategy to reduce vendor dependency. However, that approach often requires duplicating data sets for data analytics, which adds cost and management complexity, especially around data access.

Moreover, security, costs and regulatory compliance should be carefully considered throughout the process. Private cloud solutions can help keep applications closer to data, meet strict security and regulatory compliance requirements, and scale resources cost-efficiently as a data set grows. It is also important to rely on a backup and Disaster Recovery solution to maximize data protection and business continuity.

Trends that amplify the data gravity effect

According to a Digital Realty report, five macrotrends are amplifying the data gravity effect:

  1. The increase in the number of companies that are starting to manage their corporate data.
  2. The increase in company mergers and acquisitions as a consequence of globalization.
  3. The growth in number and importance of digital interactions.
  4. The criticality of data location to comply with data protection and sovereignty regulations.
  5. The combination of physical and digital security systems to improve cybersecurity.

To sum up, data has huge potential to help businesses improve, grow and innovate. Data management, governance and integration are therefore essential to leverage the benefits of the data gravity effect, as well as to mitigate its negative impact.
