How to Decrease Spaghetti Data Integration

January 14, 2020 | Data Management | Slavko Kastelic

Spaghetti Data Integration is a result of Data Fragmentation. Cloud computing services have blessed us with greater accessibility and availability of software solutions. But they have also created a big problem: additional processes and, as a consequence, data silos and integration challenges. And that is not good for modern business.

Around 35 years ago Tetris emerged – a simple yet sophisticated game that ends when the player can no longer keep up with the increasing speed and the tiles overflow the playing field.

Today, many business processes are very much like playing Tetris. To keep them alive, you need to combine different applications to satisfy customers, suppliers and/or management. The higher your position is, the more elements you need to handle and the faster you need to make decisions. Success is defined by the ability to puzzle together all the required applications and survive the complexity associated with a central feature of modern organizations: fragmentation.

Like Tetris, fragmentation offers the potential of infinite game-play. Unfortunately, in most organizations, it leads to a loss of meaning and decreased levels of engagement.

A typical corporate environment uses more than 900 applications. I recently had a meeting with a large international pharmaceutical company that uses more than 4,200 different applications. When asked, even the CIO replied that the company uses around 40 applications, and only then did he start counting. That is a perfect example of Spaghetti Data Integration.

Losing your focus?

A study showed that 68% of workers toggle between more than 10 applications per hour, losing 60 minutes per day or 24 days per year in the process. Things get more serious: 31% of workers already admit that they lose their train of thought because of such fragmentation. These partial tools don’t work well together and each supports only one or two steps of a process, so getting correct data and/or inputs from one application to another, and then another, resembles watching a football game in low resolution on a small screen. Employees can’t see the ball and so they drop it. Forbes put it clearly: to survive, companies will have to address mass data fragmentation and data silos at all costs.

Data Integration is a football game-like experience

All applications need proper integration into the business environment. With hundreds of applications, an average enterprise runs some 1,400 integrations between them, and this number is only increasing. The situation is visibly out of control.

In football, even the best players don’t guarantee a win. If their efforts are not in sync, correctly orchestrated and managed, the result can be a disaster. The same is true for data integration. You can find different teams using different methodologies, frameworks, technologies and standards. Most of the time this results in errors and in lineage, governance, and security problems. The integrations become unmanageable.

It doesn’t look like the number of applications and related data silos will be reduced significantly anytime soon. On the contrary, it is expected to increase even faster in the future.

The only way to control the information mess is to focus on integrations on all levels.

Consider data hubs as a concept to decrease data silos and get better control

Point to point data integration

The majority of the hundreds (or thousands) of data flows in companies nowadays are point-to-point solutions. They do support all the required data flows, are fully integrated and are operationally successful.

Though a point-to-point approach can easily handle a small number of connections, it proves fatal when the number increases.

Data hubs are a better solution than P2P integrations for many use cases.

Hub and spoke data integration

Hubs as such are not a new concept; they were first used in air transport to make travel more efficient. TechTarget published the article Managing Data in the Data Hub, which Gartner recycled in 2019 in Data Hubs, Lakes and Warehouses: Choosing the Core of Your Data and Analytics Platform. The Gartner piece defined the following five major Data Hub types: MDM data hub, Application data hub, Integration data hub, Reference data hub, Analytics data hub.

Number of integration links in point to point integration vs hub and spoke integration

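The number of P2P data pipes grows quadratically with the number of data endpoints: counting each pair of endpoints once, n endpoints need up to n(n-1)/2 pipes, while a hub needs only n. The two counts are equal at 3 endpoints, and P2P is larger from 4 endpoints onward.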

The basic idea is to replace the many point-to-point connections with a central hub, to which each data endpoint is connected only once. This converts a quadratic problem into a linear one, decreasing the number of data flows significantly.
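To make the difference concrete, here is a minimal sketch of the arithmetic. It assumes the worst case for point-to-point, where every pair of endpoints is linked directly and each pair is counted once. The function names are illustrative only and do not come from any product or library.

```python
def point_to_point_links(endpoints: int) -> int:
    """Links needed if every endpoint is connected directly to every other one.

    Assumption: each pair of endpoints is counted once (undirected link).
    """
    return endpoints * (endpoints - 1) // 2


def hub_links(endpoints: int) -> int:
    """Links needed if every endpoint is connected once to a central hub."""
    return endpoints


if __name__ == "__main__":
    # Compare both approaches for a few landscape sizes.
    for n in (3, 4, 10, 50, 900):
        print(f"{n:>4} endpoints: P2P up to {point_to_point_links(n):>7}, hub {hub_links(n):>4}")
```

Real landscapes rarely connect every application to every other one, so the point-to-point figure is an upper bound. The gap between the two approaches still widens quickly as endpoints are added.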

Other benefits of Data Hubs:

  • better data visibility
  • centralized control
  • a basis in modern semantics
  • lower latency and better performance
  • agility and cost-effectiveness
  • a repeatable and scalable solution
  • a simpler technology landscape

We are available for further discussions on this topic offline.
