As the volume and variety of organizational data explode, data teams face complex challenges in managing it. Existing paradigms are becoming inadequate, demanding new thinking and approaches to data management. This blog is based on a conversation between Vanya Seth, who is the 魅影直播 Head of Technology for India; Aveek Mishra of Intuit India Development Center; and Rajesh Parikh, Founder and CEO of data catalog organization Cynepia. It explores the challenges that data product teams face and how they can solve them.


Accountability

In traditional models, accountability for data often lies with data teams. However, data teams don’t always have the domain knowledge needed to fully understand the data in front of them. “Within every organization/industry, there are hundreds of sub-domains, making it almost impossible for data teams to become experts,” says Vanya Seth, Head of Technology for 魅影直播 in India.

Domain teams, on the other hand, should already know the landscape and understand not only the data they have but also the integrity rules it should abide by. This is where the Data Mesh model comes into play: an analytical data architecture and operating model in which data is treated as a product and owned by the teams that most intimately know and consume it. The model shifts accountability for the quality, integrity and usability of the data to domain teams, which should enhance the organization’s ability to realize value from the data. Simply put, Data Mesh brings the DevOps model to data management.

Data quality

As Rajesh Parikh, Founder and CEO at Cynepia Technologies, points out, “one of the biggest challenges in data management today is that bad data isn’t tracked in the data pipeline and so ends up in consumer-facing reports and dashboards.” Current solutions such as observability and data contracts are inadequate if bad data flows uncontrolled. This is true of Data Mesh architecture as well. Aveek Misra, an Engineering Manager for Data Engineering at Intuit India Product Development Centre, adds that “today, whatever quality control is in place fails to detect issues because they do not embody business rules. They perform null checks, row count checks and hash checks, but that is not enough.” This is because data quality control (QC) systems lack domain expertise and knowledge.

These are several interrelated data quality problems that need to be addressed thoughtfully. As accountability shifts to domain teams, they also take on responsibility for defining quality metrics. For instance, in the healthcare industry, the domain team should know that a diabetes test result is valid for only three months, which puts it in the best position to define such rules. However, shifting responsibility upstream is not enough on its own.
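As a rough illustration of the gap between a generic check and a domain rule, the Python sketch below encodes the three-month validity window from the diabetes example. The record fields (`patient_id`, `test_date`), the approximation of three months as 90 days, and the function names are all assumptions made for illustration; none of them come from the conversation.

```python
from datetime import date, timedelta

# Assumption: "three months" is approximated as 90 days.
VALIDITY_WINDOW = timedelta(days=90)

def has_required_fields(record: dict) -> bool:
    """Generic check: required fields are present and not null."""
    return all(record.get(f) is not None for f in ("patient_id", "test_date"))

def is_result_current(record: dict, as_of: date) -> bool:
    """Domain rule: a test result is only valid inside the validity window."""
    age = as_of - record["test_date"]
    return timedelta(0) <= age <= VALIDITY_WINDOW

def validate(record: dict, as_of: date) -> list[str]:
    """Return a list of rule violations (empty if the record is valid)."""
    if not has_required_fields(record):
        return ["missing required field"]
    if not is_result_current(record, as_of):
        return ["test result is outside the validity window"]
    return []

# Hypothetical records: both pass the null check, only one passes the domain rule.
stale = {"patient_id": "p-1", "test_date": date(2023, 1, 10)}
fresh = {"patient_id": "p-2", "test_date": date(2023, 5, 20)}
```

Evaluated as of 1 June 2023, the `stale` record would pass a null check yet fail the domain rule, which is the kind of issue a purely generic QC system would miss.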

Data products are built at multiple levels, from source-oriented data products to aggregated data products to consumer-oriented data products, so even if the data is high quality at the source, it may become distorted later. This is why there need to be robust QC mechanisms at every stage of the data product journey. To achieve this, domain and data teams must work closely together to ensure that quality testing is aligned with business goals.

Customer centricity

In most organizations, customer-centricity is treated as an external concern. However, data teams are the customers for the data that domain teams generate. Unfortunately, as internal customers, they don’t get the same treatment that external customers do. Customer delight is rarely a top priority, and this creates inefficiencies in data management.

“If a data scientist has to hypothesize on how to improve sales, they might have to ask the product teams for that data, or ask the best way to run experiments. This increases the cognitive load on data teams,” says 魅影直播’s Vanya. The Data Mesh model addresses this problem by challenging the status quo and making domain teams responsible for providing the data. They should actively collaborate with the data team to define details such as whether a SQL interface or a graph format should be used.

Skills and capabilities

Roles in data management are barely a decade old, and they are evolving rapidly. “For instance, today, a data analyst is dashboarding and transforming data simultaneously. A data scientist is transforming data and building models. We are looking at an overlap of responsibilities. As the accountability for data moves left, should domain teams also consist of data analysts/engineers/scientists?” asks Cynepia Technologies’ Rajesh.

Vanya, Head of Tech at 魅影直播, thinks not. “As these are specialized skills, it is hard to obtain and retain such talent at scale,” she says. This problem is better solved by defining a domain-agnostic self-serve platform that provides everything domain teams need to leverage that data. The Data Mesh abstracts interactions and flow, creating platform capabilities that democratize data so that any developer can build a data product.

Product mindset

Building data products needs a product mindset. As Intuit’s Aveek points out, “In the product world, we have solved how microservice contracts are validated, checking for resilience and circuit breakers, for example. Some of these best practices need to be brought to the data world.”
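Borrowing from contract testing in microservices, a data contract can be checked mechanically before a data product is published. The Python sketch below is one possible shape for such a check, not a description of any tool mentioned in the conversation. It assumes a contract is simply a mapping of column names to expected Python types; the `CONTRACT` fields and the `check_contract` function are hypothetical.

```python
# Hypothetical contract: column name -> expected Python type.
CONTRACT = {"order_id": str, "amount": float, "currency": str}

def check_contract(row: dict, contract: dict[str, type]) -> list[str]:
    """Return contract violations for a single row (empty if it conforms)."""
    problems = []
    for column, expected in contract.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected):
            problems.append(f"{column}: expected {expected.__name__}")
    return problems
```

A producer could run a check like this in its pipeline and refuse to publish rows that violate the contract, much as a service build fails when a contract test breaks.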

魅影直播’s Vanya expands on this mindset change by suggesting that for data to be considered a product, it needs to be long-lived and used repeatedly. Product-thinking data teams are not just solving point-in-time problems but creating long-term, reusable solutions. This also prevents the creation of hundreds of quick pipelines for point-in-time problems, which end up making the system so complex that it could collapse like a house of cards.

Governance

“We have traded off data quality and governance for speed of delivery,” says Rajesh from Cynepia Technologies, raising an important concern. The traditional model of governance, which was centralized with a team at the top making decisions, is no longer viable. Moreover, master data management, the most commonly used model today, cannot scale at the rate at which data is evolving.

The future needs yet another change in mindset, this time centered on governance. Vanya reiterates decentralization as an approach that can bring about the required change. Not unlike the microservices model, it is enabled by thoughtful automation. “Platform teams have to automate governance issues, leveraging policy as code and governance as code models. Domain teams must find ways to computationally give feedback to the developer and let the platform take care of it,” she says.
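One way to picture policy as code is as a set of machine-checkable rules that a platform runs against every data product's metadata, reporting violations back to the developer. The Python sketch below is only an illustration under assumptions: the descriptor fields (`owner`, `pii_columns`, `retention_days`) and the two policies are hypothetical and are not taken from any specific platform.

```python
from typing import Callable, Optional

# A policy inspects a data product descriptor and returns an error message,
# or None when the descriptor complies.
Policy = Callable[[dict], Optional[str]]

def must_have_owner(product: dict) -> Optional[str]:
    if not product.get("owner"):
        return "data product has no owning domain team"
    return None

def pii_needs_retention(product: dict) -> Optional[str]:
    if product.get("pii_columns") and product.get("retention_days") is None:
        return "PII columns declared without a retention period"
    return None

# Policies live in version control alongside the platform code.
POLICIES: list[Policy] = [must_have_owner, pii_needs_retention]

def evaluate(product: dict) -> list[str]:
    """Run every policy and collect the violations as developer feedback."""
    results = (policy(product) for policy in POLICIES)
    return [message for message in results if message is not None]

# Hypothetical descriptor: declares PII but no retention period.
orders = {"name": "orders", "owner": "sales-domain", "pii_columns": ["email"]}
```

Because the rules are ordinary functions, adding a governance requirement becomes a code change that can be reviewed and tested rather than a decision handed down by a central team.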

However, governance is not just about quality but also about discoverability. We need to create platforms that let consumers browse data and choose the right dataset for every use case. Governance needs to be metric-driven and transparent. “And teams should think about it from day one,” adds Intuit’s Aveek. “For instance, GDPR mandates the deletion of data if a customer requests it. But does one even know where all that data is stored? In such cases, lineage becomes very important.”
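To make the lineage point concrete, the sketch below models lineage as a directed graph from each dataset to the datasets derived from it, and walks it breadth-first to list every place a customer's data may have flowed. The dataset names and the `LINEAGE` mapping are invented for illustration; a real platform would presumably read this information from its catalog.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets derived from it.
LINEAGE = {
    "crm.customers": ["sales.orders_enriched", "marketing.segments"],
    "sales.orders_enriched": ["reports.monthly_revenue"],
    "marketing.segments": [],
    "reports.monthly_revenue": [],
}

def downstream(source: str, lineage: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every dataset derived from `source`."""
    seen = {source}
    order = []
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

Given a deletion request against `crm.customers`, calling `downstream("crm.customers", LINEAGE)` would list the derived datasets that also need to be examined.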

As data becomes a competitive advantage for businesses, the challenges around data management are likely to become more complex. Success lies in empowering every individual to be accountable for what they do best and to leverage every asset without unnecessary bottlenecks.

For a more detailed understanding of the topic, you can watch the conversation.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of 魅影直播.