Improving Data Mesh Patterns Using Data Fabric Architecture
Data professionals are often forced to collaborate with teams from multiple disciplines. In the endeavour to deliver the right data to the right customers at the right time, they get entangled in an error-prone and time-consuming process. That happens largely due to the centralized structure of most data architectures, including fabrics.
With the arrival of data mesh, data scientists and engineers have a breather, as their workload is distributed. Unlike traditional practices, the mesh decentralizes the ownership of data sets, known as data products, to each business domain. Only selected components are reserved for centralized decision-making, and the data itself is distributed across a wide range of physical sites according to the proximity of each business entity.
However, that hasn’t fully convinced everyone to detach from centralized architecture. After all, a centralized system is simpler to set up and less costly to maintain. Despite the pressing need for the mesh, certain challenges stand in the way of its adoption.
Data Mesh Challenges
Managing data assets across multiple disparate systems creates a complicated ecosystem, with dependencies on different domain entities. Here are the main challenges.
Data duplication across domains
When the data of one business domain is remodelled to fit another, redundancy results, inviting faults and higher management costs.
Ensuring exclusivity of every data domain
Business domains vary from each other at multiple levels. They could have different governance policies, quality standards, data volumes and pipelining requirements. In some scenarios, however, data products and their respective pipelines could be shared across domains. Here it is essential to preserve each domain's exclusivity so that domains do not impact one another.
Change management
Migrating from centralized data management to a more flexible, decentralized mesh architecture requires careful execution. It demands strategically implemented change-management practices, and existing data and analytics tools must be adapted and augmented to support the mesh architecture.
Cost and risk
Embracing decentralization also attracts additional costs and risks. It requires an end-to-end makeover of integration, preparation, virtualization, governance, masking, orchestration and cataloguing processes. The exercise could be time-consuming and risky, with uncertain costs.
Implementing Data Mesh with an Entity-Based Data Fabric
Despite the repeated comparisons between mesh and fabric, there is a compelling case for implementing both. By fusing the concepts of domain-specific business entities and data as a product, a fabric can resolve the problems of the mesh architecture.
The data fabric connects all disparate data sources through an integration layer. It can then deliver a holistic, real-time view of the analytical and operational workloads for every entity.
In parallel to the mesh architecture’s distributed pattern, the fabric centralizes the semantic definition of all data products. It also establishes data ingestion methods and centralizes governance policies, securing the data stored within the products in compliance with the regulations predefined for the entire landscape.
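To make this concrete, here is a minimal sketch of the pattern: domains own their data products, while a central catalog holds the semantic definitions and enforces a governance policy at registration time. All names (`FabricCatalog`, `DataProduct`, the masking rule) are hypothetical illustrations, not any vendor's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A domain-owned data product registered with the central fabric catalog."""
    name: str
    domain: str
    schema: dict                         # semantic definition, centrally agreed
    masking_rules: list = field(default_factory=list)

class FabricCatalog:
    """Central layer: semantic definitions and governance policies live here,
    while each domain retains ownership of its product's data."""
    def __init__(self, required_masked_fields):
        self.required_masked_fields = set(required_masked_fields)
        self.products = {}

    def register(self, product: DataProduct):
        # Centralized governance check: sensitive fields in the schema
        # must carry a masking rule before the product is accepted.
        sensitive = self.required_masked_fields & set(product.schema)
        missing = sensitive - set(product.masking_rules)
        if missing:
            raise ValueError(f"Unmasked sensitive fields: {sorted(missing)}")
        self.products[(product.domain, product.name)] = product
        return product

# A domain team registers its product against the landscape-wide policy.
catalog = FabricCatalog(required_masked_fields=["ssn", "email"])
catalog.register(DataProduct(
    name="customer-360",
    domain="sales",
    schema={"customer_id": "str", "email": "str", "lifetime_value": "float"},
    masking_rules=["email"],
))
```

The design point is that the compliance check runs in one central place, so every domain's product passes through the same regulations regardless of who built it.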
K2View’s fabric solution is an apt case study here. Known for storing each business entity’s data in its own micro-database, K2View takes an interesting approach to the mesh. It captures data from all sources and integrates it into data products as required, enabling seamless distribution to any number of domains.
It grants data ownership to all the domains while centralizing cataloguing and governance. The same applies to both types of workloads, analytical and operational. The K2View fabric works because it provides a single, holistic view of every business entity without compromising each domain's independence.
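The entity-based idea can be sketched as follows: each business entity instance (say, one customer) gets its own small, self-contained store that aggregates that entity's records from every source system. This is an illustrative toy using in-memory SQLite, with made-up source names (`crm`, `billing`), not K2View's actual implementation.

```python
import sqlite3

# Hypothetical source systems, each holding a slice of the customer's data.
SOURCES = {
    "crm":     [{"customer_id": 42, "name": "Ada"}],
    "billing": [{"customer_id": 42, "invoice": "INV-7", "amount": 120.0}],
}

def build_micro_db(customer_id):
    """Materialize one customer's holistic view in its own tiny database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE crm (customer_id INTEGER, name TEXT)")
    db.execute("CREATE TABLE billing (customer_id INTEGER, invoice TEXT, amount REAL)")
    # Pull only this entity's rows from each disparate source.
    for row in SOURCES["crm"]:
        if row["customer_id"] == customer_id:
            db.execute("INSERT INTO crm VALUES (?, ?)",
                       (row["customer_id"], row["name"]))
    for row in SOURCES["billing"]:
        if row["customer_id"] == customer_id:
            db.execute("INSERT INTO billing VALUES (?, ?, ?)",
                       (row["customer_id"], row["invoice"], row["amount"]))
    return db

db = build_micro_db(42)
name, total = db.execute(
    "SELECT crm.name, SUM(billing.amount) "
    "FROM crm JOIN billing USING (customer_id)"
).fetchone()
```

Because each entity's store is independent, one domain can query or update its customers without touching another domain's data, which is what preserves the mesh-style independence inside a fabric.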
Amidst the growing data fabric vs. data mesh debate, K2View’s platform provides multi-dimensional abstraction and automation.
Fabric Resolving the Issues of Mesh
Data pipelining for every domain requires the expertise to build a distributed system that connects to multiple, disparate source systems across the enterprise landscape. In complex integration scenarios, the challenge is even greater.
Fabrics help build exclusive data products for every business entity in a virtual layer. This eliminates the need for domains to deal with the underlying systems, automating integration work that would otherwise demand deep expertise.
Despite domain-level independence for all business entities, central teams still play a role in the mesh architecture. Striking this balance between decentralized and centralized entities is a challenge that requires cross-level collaboration.
With fabrics, domain-level teams can work with centralized authorities to write APIs for their exclusive data consumers, thereby preserving governance, controlling access rights and monitoring usage. This way, data products comply with both the centralized and the decentralized entities.
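One way this division of labour can look in practice: the central team supplies a governance wrapper that checks access rights and records usage, while the domain team writes only the endpoint body. Everything here (`ACCESS_POLICY`, the roles, the `governed` decorator) is a hypothetical sketch of the pattern, not a specific product's API.

```python
import functools
import time

# Central policy: which roles may read which data product.
ACCESS_POLICY = {"customer-360": {"sales-analyst", "support-agent"}}
USAGE_LOG = []  # central usage monitoring

def governed(product_name):
    """Wrapper maintained by the central team; enforces access and logs usage."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(caller_role, *args, **kwargs):
            if caller_role not in ACCESS_POLICY.get(product_name, set()):
                raise PermissionError(f"{caller_role} may not read {product_name}")
            USAGE_LOG.append((time.time(), caller_role, product_name))
            return fn(*args, **kwargs)
        return inner
    return wrap

@governed("customer-360")
def get_customer(customer_id):
    # Domain-owned logic: in reality this would query the domain's data product.
    return {"customer_id": customer_id, "name": "Ada"}

record = get_customer("sales-analyst", 42)
```

The domain keeps ownership of what the endpoint returns; the centre keeps ownership of who may call it and how that usage is audited.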
Data products have to be consistently delivered to consumers across a variety of environments, online or offline, from a single platform, so ensuring the secure delivery of both batch and real-time data is a challenge. With an entity-based data fabric, capturing and processing large volumes of data from disparate systems becomes fast and efficient, provisioning on-demand delivery of data products for operational and analytical use cases alike.
As the IT landscape rapidly migrates to the cloud, the rate of data production and consumption will surpass all previous peaks. Needless to say, this will intensify the pressure on data professionals to deliver the best insights to their users. They should move beyond primitive strategies, and beyond comparing mesh and fabric: a powerful possibility lies in combining both.