How To Escape Data Gravity in a Hybrid Cloud Architecture

Under the physical law of gravity, objects with larger mass attract those with less. By analogy, bigger data sets pull applications, services, and smaller data sets closer to them, because proximity improves latency and throughput.

Pablo Iorio
May 10, 2021

Why is this concept important?

Data gravity is a useful analogy because it is easier to move compute (applications and/or services) than to move petabytes between data centers, let alone when the move also requires migrating the data from one schema or format to another. This needs to be considered when planning the data center strategy.
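To put rough numbers on this, the sketch below estimates how long a bulk copy takes over a dedicated link. The dataset size (1 PB), link speed (10 Gbit/s), and 70% effective utilization are illustrative assumptions, not figures from any particular environment, and transfer_days is a hypothetical helper.

```python
PETABYTE = 10**15   # bytes (decimal petabyte)
GIGABIT = 10**9     # bits
SECONDS_PER_DAY = 86_400


def transfer_days(dataset_bytes: float,
                  link_bits_per_second: float,
                  efficiency: float = 0.7) -> float:
    """Estimate the days needed to copy a dataset over a network link.

    efficiency is the assumed fraction of nominal bandwidth that is
    actually usable (protocol overhead, contention, retries).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    effective_bps = link_bits_per_second * efficiency
    seconds = dataset_bytes * 8 / effective_bps
    return seconds / SECONDS_PER_DAY


# Assumed scenario: 1 PB over a 10 Gbit/s link at 70% utilization.
days = transfer_days(1 * PETABYTE, 10 * GIGABIT)
print(f"Estimated transfer time: {days:.1f} days")
```

Under these assumptions, one petabyte takes roughly 13 days to copy, before any schema or format conversion. Redeploying an application next to the data is usually much quicker than that.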

As business data continues to become an ever-increasing commodity, it is essential that data gravity be taken into consideration when designing solutions that will use that data. One must consider not only current data gravity but its potential growth. Data gravity will only increase over time, and in turn, will attract more applications and services. [1]

The latency/throughput trade-off

When applications and services are closer to the data they need, users experience lower latency and applications achieve higher throughput, which makes those applications and services more useful and reliable. At first glance, it therefore seems convenient to place applications and services next to big datasets. However, this would force you to go either fully on-premise or all-in on a single cloud vendor.
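To make the latency effect concrete, here is a minimal sketch of a request that makes several sequential round trips to its data store. The round-trip times (1 ms co-located, 40 ms cross-site), the 50 calls per request, and the 2 ms of work per call are assumptions chosen for illustration; request_seconds is a hypothetical helper.

```python
def request_seconds(round_trips: int,
                    rtt_ms: float,
                    work_ms_per_trip: float = 2.0) -> float:
    """Time for one request that makes sequential calls to its data store.

    Each call costs one network round trip (rtt_ms) plus the assumed
    time the data store spends on the call (work_ms_per_trip).
    """
    return round_trips * (rtt_ms + work_ms_per_trip) / 1000


# Assumed scenario: 50 sequential calls per request.
co_located = request_seconds(50, rtt_ms=1.0)    # app next to the data
cross_site = request_seconds(50, rtt_ms=40.0)   # app in another site

print(f"co-located: {co_located:.2f} s per request, "
      f"{1 / co_located:.1f} requests/s per worker")
print(f"cross-site: {cross_site:.2f} s per request, "
      f"{1 / cross_site:.1f} requests/s per worker")
```

With these assumed figures, the same request takes 0.15 s next to the data and 2.1 s across sites, so each worker handles about 14 times fewer requests. Chatty access patterns are what make distance from the data expensive.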

Data gravity and vendor lock-in

I prefer to view this topic as a high migration cost rather than lock-in, because some form of lock-in is inevitable. At the end of the day, you need to rely on partners, and even if you build everything from scratch, you still depend on a programming language, a database, and so on. Lock-in is not a simple true-or-false matter:

avoiding being locked into one aspect often locks you into another

Data gravity and hybrid cloud

If you can go fully on-premise/co-location, or you started all-in on the cloud, then gravitational forces affect you less, because your data and compute already sit together.

Complexity increases when you want to achieve either hybrid (on-prem + public cloud) or multi-cloud (different public cloud providers).

Hybrid cloud is a good way to gain agility for the applications and services running in the cloud while still supporting on-premise systems such as mainframes. The need to adapt and change direction quickly is a core principle of a digital business. Your enterprise might want (or need) to combine public clouds, private clouds, and on-premises resources to gain the agility it needs for a competitive advantage. A multi-cloud strategy also leaves options open and helps “avoid lock-in”; however, we can assume some level of lock-in is inevitable, as I described before.

From a compute angle, there are more options available; the situation gets problematic as soon as we need to move large datasets. Enterprises can take steps to mitigate the negative impact of data gravity through proper data management and data governance.

How do we deal with data gravity then?

My view is that the answer is not purely technical. We will need several processes in place to resolve it:

  1. As part of Enterprise Architecture, IT should have an enterprise data strategy that facilitates two IT disciplines: data governance and systems integration.
  2. Data governance defines who has authority and control over data assets and how those data assets may be used. It encompasses the people, processes, and technologies required to manage and protect data assets.
  3. Systems integration is the process of connecting different sub-systems (components) into a single larger system that functions as one.

Once the enterprise data strategy, data governance, and systems integration are working together, hybrid cloud architectures become feasible. They allow the organization to escape the pull to keep everything in the same data center or cloud provider, which enables agility in development and delivery.

References

  • [1] Data Gravity: What it Means for Your Data
  • [2] Data Gravity — in the Clouds by Dave McCrory
  • [3] What is hybrid cloud?
  • [4] Don’t get locked up into avoiding lock-in by Gregor Hohpe

Disclaimer

This is a personal article. The opinions expressed here represent my own and not those of my employer.
