The Modern Data Trust Challenges

June 30, 2022

1. Data Trust Challenges

Data trust challenges are more common than we think. They are noticeable in all industries and are both technical and cultural in nature. That means there is a divide between data users and data owners in how they perceive data trust.

Today we gather data from so many different sources that it’s almost impossible to keep up. According to IDC, the amount of data created, captured, and replicated annually across the world will grow more than fivefold to 175 zettabytes by 2025. But collecting data isn’t enough, and data you can’t trust can even present a risk.

Results from the 2020 PwC study show that user confidence in data decreased by almost 30% in just two years, even though companies believe that customer confidence in their data increased by 50% over the same period.

Figure: The difference in perception of data trust between data users and IT.

What’s more, research also shows that 60% of business executives don’t always trust their company’s data. That suggests many decisions aren’t truly data-based, because how can decision-makers who don’t trust the data trust their own decisions?

That’s why companies need a data trust strategy, so they can keep up with all the information and still run an efficient business. But how do we address the complex challenge of having the right data, fully trusting it, and making sure consumers trust it as well?

First, let’s look into what it means to trust your data.


2. Do You Really Trust Your Data?

Every day, business users make many strategic, tactical, and operational decisions about markets, budgets, customers, and products. However, these decisions are only as good as the data used to make them. A study from KPMG shows that only 35% of executives have a high level of trust in their organization’s data, and that distrust of company data increases over time.

Having data that is healthy and ready to act on is the pinnacle of data trust. Such data helps employees provide exceptional customer experiences, improve operations, ensure compliance, and drive innovation.

Every data team’s needs will be different. Teams must make their own assessments of the criteria that trusted data should meet. For example, finance teams require an exceptionally high level of accuracy, while other departments may place a premium on timeliness instead. Data quality, as part of data management, is only one metric of the data trust score, but it is a very important one and can give you extensive insight into your data.
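To make the idea of team-specific requirements concrete, here is a minimal sketch in Python. It assumes each team sets its own minimum acceptable value per dimension; the team names, the threshold values, and the `unmet_requirements` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustRequirements:
    """Minimum acceptable value (0.0 to 1.0) per dimension for one team."""
    accuracy: float
    timeliness: float


# Hypothetical thresholds, not taken from any study or product.
TEAM_REQUIREMENTS = {
    "finance": TrustRequirements(accuracy=0.99, timeliness=0.70),
    "marketing": TrustRequirements(accuracy=0.90, timeliness=0.95),
}


def unmet_requirements(team: str, measured: dict[str, float]) -> list[str]:
    """Return the dimensions where the measured value is below the team's minimum."""
    required = TEAM_REQUIREMENTS[team]
    return [
        name
        for name in ("accuracy", "timeliness")
        if measured[name] < getattr(required, name)
    ]


# The same dataset can be good enough for one team and not for another.
dataset_metrics = {"accuracy": 0.95, "timeliness": 0.97}
print(unmet_requirements("finance", dataset_metrics))    # ['accuracy']
print(unmet_requirements("marketing", dataset_metrics))  # []
```

In this sketch the same dataset passes for the marketing team but fails the finance team’s accuracy requirement, which is the kind of difference each team has to decide on for itself.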


3. Five Metrics of Data Trust Score

A Data Trust Score is a single number that represents the level of confidence in the data. It is calculated by combining several metrics. The Talend Trust Score™, for instance, aggregates five metrics into a single, easy-to-understand score on a scale from 0 to 5.
These metrics are:

1. Usage shows how much your dataset is used as a source for pipelines or preparations.

2. Completeness measures the number of empty records in your sample.

3. Popularity measures the reliability of the dataset, based on user ratings and certification level.

4. Validity measures the quality of the dataset itself, with the number of valid and invalid values across the dataset sample, as well as the use of semantic types.

5. Discoverability reflects how well-documented your dataset is, with the use of proper metadata such as description, tags, custom attributes, as well as the presence of an API.
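As a rough illustration of how metrics like these could be computed and combined, the Python sketch below measures completeness and validity on a small sample and then averages five metric values into a score from 0 to 5. This is not Talend’s actual formula: the equal weighting, the email pattern, the sample values, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical sample column: None or "" marks an empty value.
sample_emails = ["ana@example.com", "", "bob@example", None, "li@example.org"]

# Simplified email pattern, used here only as a stand-in validity rule.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(values):
    """Share of values that are not empty (0.0 to 1.0)."""
    filled = [v for v in values if v not in (None, "")]
    return len(filled) / len(values)


def validity(values, pattern):
    """Share of non-empty values that match the pattern (0.0 to 1.0)."""
    filled = [v for v in values if v not in (None, "")]
    if not filled:
        return 0.0
    valid = [v for v in filled if pattern.match(v)]
    return len(valid) / len(filled)


def trust_score(metrics):
    """Equal-weight average of metric ratios, scaled to 0-5 (assumed formula)."""
    return round(5 * sum(metrics.values()) / len(metrics), 2)


metrics = {
    "usage": 0.8,            # assumed value
    "completeness": completeness(sample_emails),
    "popularity": 0.7,       # assumed value
    "validity": validity(sample_emails, EMAIL_PATTERN),
    "discoverability": 0.5,  # assumed value
}
print(metrics["completeness"], trust_score(metrics))
```

Here three of the five sample values are filled in, and two of those three match the pattern, so a dataset can look reasonably complete while still containing invalid values.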

Data that performs well in one dimension can’t necessarily be 100% trusted. As the metrics above suggest, you might have information that’s valid but not accurate, or accurate but incomplete. It could also be high-quality but inaccessible.


4. How Can You Get Better Data?

Improving the health of data used by a company should follow a core set of principles, which we refer to as “the five Ts of trust”: your data must be thorough, transparent, timely, traceable, and tested. The new Talend Trust Score, designed to help users dynamically rate datasets based on these criteria, could fundamentally change the way we use all available data and insights to make decisions.

The Data Trust Score helps you make better decisions based on higher-quality data. This is made possible by the new Data Inventory application that:

  • Automatically and systematically assesses underlying data to ensure the use of more meaningful and trustworthy data
  • Provides a Trust Score that combines proprietary Talend technology for assessment of data quality with crowdsourced metrics, such as user ratings and times shared
  • Uses advanced search technologies to deliver immediate access to the right data for anyone

Talend Trust Score automatically indexes datasets with a crawler to give users a complete picture of data health before they start using the data. Data can be assessed in multi-cloud, on-prem, and hybrid environments. The Trust Score exposes the metrics used to calculate data trust: validity, popularity, completeness, discoverability, and usage. Talend Data Fabric also helps resolve data problems automatically by recommending the appropriate transformations using semantic types.
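The general idea of using semantic types to suggest fixes can be sketched in a few lines of Python. This is an illustrative toy, not how Talend Data Fabric works internally: the two semantic types, their patterns, the 0.6 threshold, and the function names are all hypothetical.

```python
import re

# Hypothetical semantic types and the pattern each one expects.
SEMANTIC_TYPES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_zip_code": re.compile(r"^\d{5}$"),
}


def infer_semantic_type(values, threshold=0.6):
    """Return the type matched by the largest share of values, or None if below threshold."""
    best_type, best_share = None, 0.0
    for name, pattern in SEMANTIC_TYPES.items():
        # Surrounding whitespace is ignored when guessing the type.
        share = sum(1 for v in values if pattern.match(v.strip())) / len(values)
        if share > best_share:
            best_type, best_share = name, share
    return best_type if best_share >= threshold else None


def recommend_fixes(values, semantic_type):
    """List (value, suggestion) pairs for values that do not fit the type."""
    pattern = SEMANTIC_TYPES[semantic_type]
    fixes = []
    for v in values:
        if pattern.match(v):
            continue  # already valid, nothing to recommend
        if pattern.match(v.strip()):
            fixes.append((v, "trim whitespace"))
        else:
            fixes.append((v, "review manually"))
    return fixes


zip_codes = ["75001", " 10115", "9021", "30301"]
detected = infer_semantic_type(zip_codes)
print(detected)
print(recommend_fixes(zip_codes, detected))
```

In this example the column is recognized as a ZIP code column, the value with a leading space gets a concrete transformation suggestion, and the four-digit value is flagged for manual review.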

Don’t miss the chance to evaluate your data, gain insight into how trustworthy it really is, and get your free data trust score!


5. Resources

KPMG: Guardians of trust

IDC: The Digitization of the World from Edge to Core

OECD: ICT Access and Usage by Businesses

PwC: In data we trust: living up to the credo of the 21st century

Talend: DH_Survey_report