Data Quality: What it is and how technology can improve it

Data is a burning topic in the business world these days and one of the most valuable assets a financial organisation has access to. However, to be useful, data needs to be of high quality; otherwise, it can severely affect the decision-making process and, ultimately, revenues.

Poor data quality costs organisations $12.9 million per year on average, according to Gartner. Creditors have started to realise that data is a gold mine, which is why 7 out of 10 companies will track their data via metrics by 2022, aiming to improve it by 60% in order to reduce their operational costs and the risks involved.

What data quality is

Before we dig deeper, it is useful to state a few key points about data quality. Data quality is a measure of the condition of data, used to ensure that the data is fit for the purpose it will serve. High data quality means the data is highly relevant to answering the specific needs of an organisation. Examples of data quality issues include incomplete data, incorrect data and poorly organised data.
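To make "measuring the condition of the data" more concrete, here is a minimal sketch of two common checks: completeness (is the value there?) and validity (does it pass a business rule?). The record layout, the field names and the rule that balances must be non-negative are all assumptions made for illustration, not part of any particular standard.

```python
from datetime import date

# Hypothetical loan records; field names and values are illustrative only.
records = [
    {"loan_id": "A1", "balance": 1200.0, "due_date": date(2021, 3, 1)},
    {"loan_id": "A2", "balance": None, "due_date": date(2021, 4, 1)},
    {"loan_id": "A3", "balance": -50.0, "due_date": None},
]


def completeness(rows, field):
    """Share of rows in which the field is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def validity(rows, field, is_valid):
    """Share of populated values that pass the supplied rule."""
    values = [row[field] for row in rows if row.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for value in values if is_valid(value)) / len(values)


# Assumed rule for this sketch: a balance must not be negative.
balance_completeness = completeness(records, "balance")
balance_validity = validity(records, "balance", lambda value: value >= 0)
due_date_completeness = completeness(records, "due_date")
```

Tracking a handful of such ratios per field over time is one simple way to turn "data quality" into metrics that can be monitored and improved.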

Data Quality Misconceptions

Moving forward, we shed light on the differences between data quality, data standardisation, information content and information value, as they often get mixed up.

To get things straight, data standardisation is not a technological problem at all. It is emerging on its own as the need to exchange data with multiple actors becomes more pressing. Even in the notoriously unstandardised non-performing exposure (NPE) servicing space, data interchange standards are emerging, either de facto, as certain platforms dominate the market, or by fiat, for example through templates published by regulators.

There is, however, an interesting conflict between two ways of standardising. The traditional way is to standardise formats by providing templates or schemas to which everybody conforms. The alternative is to standardise on mechanisms for handling variability, such as the ability to describe data models in machine-readable form so that disparate data sources can be consumed. Some domains have chosen the first approach (e.g. EBA NPE templates) and some the second (e.g. CRD4 and ESEF regulatory reporting). It will be interesting to see which approach dominates in the end and for what reasons.
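The contrast between the two approaches can be sketched in a few lines. This is a toy illustration only: the field names, the descriptor layout and the helper functions are hypothetical and bear no relation to the actual EBA templates or to the formats used in regulatory reporting.

```python
# Approach 1: a fixed template that every data provider must conform to.
TEMPLATE_FIELDS = {"loan_id": str, "outstanding_balance": float}


def conforms_to_template(record):
    """True if the record has exactly the template fields with the expected types."""
    return record.keys() == TEMPLATE_FIELDS.keys() and all(
        isinstance(record[name], kind) for name, kind in TEMPLATE_FIELDS.items()
    )


# Approach 2: the provider keeps its own model but ships a machine-readable
# description of it, which the consumer uses to translate the data.
DESCRIPTOR = {
    "LoanRef": {"target": "loan_id", "type": "str"},
    "Bal": {"target": "outstanding_balance", "type": "float"},
}
CONVERTERS = {"str": str, "float": float}


def apply_descriptor(record, descriptor):
    """Rename and convert the provider's fields as the descriptor instructs."""
    return {
        spec["target"]: CONVERTERS[spec["type"]](record[source])
        for source, spec in descriptor.items()
    }


provider_record = {"LoanRef": 1042, "Bal": "1530.75"}
normalised = apply_descriptor(provider_record, DESCRIPTOR)
```

In the first approach the burden falls on each provider to reshape its data before sending it. In the second, the provider only has to describe its data accurately, and the consumer's tooling absorbs the variability.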

In terms of information content, technology can help to capture additional information, integrate it and correlate it with other available data to unlock hidden potential. This is where technological advances like open banking, standardised social media APIs and state digitisation initiatives can make a difference.

Note that there is an exception to all of the above. Data quality and availability are sometimes intentionally degraded to standardise data flows, exert control over information flow or reduce privacy concerns. In this case, creditors often have to spend a lot of effort to recover or work around information that exists but has not been provided. However, this issue is likely to keep fading in the near future, since the data ecosystem keeps applying pressure on all actors concerned, and they increasingly recognise that there is more to be gained by data sharing than by data hoarding.

When it comes to information value viewed through the prism of Return on Investment (ROI), much hype exists about the role of improving data quality and providing additional, non-traditional information. The truth is that the value of such initiatives is highly context-dependent and impossible to estimate without first running a trial with actual data. It may sound exciting to say that Facebook likes are predictive of a customer's behaviour, and it may even be true, but if the trial shows that they are only a tenth as predictive as traditional past payment behaviour, then it would only make sense to consider them if the traditional information is unavailable.
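As a rough sketch of what such a trial might measure, the snippet below compares how well two signals rank customers who defaulted above those who did not, using the area under the ROC curve (AUC). This is only one of several possible measures of predictive power. All numbers are synthetic, and the names `missed_payments` and `social_signal` are hypothetical stand-ins for a traditional and a non-traditional data source.

```python
def auc(scores, outcomes):
    """Chance that a random positive case outscores a random negative one.

    Ties count as half a win. 0.5 means no better than a coin flip.
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def predictive_strength(scores, outcomes):
    """Distance of the AUC from a coin flip, rescaled to the range 0 to 1."""
    return abs(auc(scores, outcomes) - 0.5) * 2


# Synthetic trial data, for illustration only (1 = customer defaulted).
defaulted = [1, 1, 1, 1, 0, 0, 0, 0]
missed_payments = [5, 4, 3, 1, 2, 0, 0, 1]  # traditional signal
social_signal = [3, 1, 2, 2, 2, 1, 3, 1]    # hypothetical non-traditional signal

traditional_strength = predictive_strength(missed_payments, defaulted)
candidate_strength = predictive_strength(social_signal, defaulted)
relative_value = candidate_strength / traditional_strength
```

A low `relative_value` on real trial data would suggest the candidate source is worth pursuing mainly where the traditional signal is missing. A real trial would also need far more records and an out-of-sample check before drawing any conclusion.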

How is technology improving Data Quality?

In the past, assessing initiatives that aimed to improve data quality was a costly and high-risk endeavour. Financial institutions might have spent a lot of time and effort only to realise at the end of the attempt that they had nothing to show for it. But the techniques and, most importantly, the automation now available can bring together all the candidate information at little cost for processing and evaluation, and quickly sort through what is relevant and what is not. They can also repeat this process as the world changes, automatically keeping up with shifts in customer behaviour and data availability. This is where technology has a lot to offer.

All in all, technology can do quite a lot to improve data quality, but it cannot generate information that isn't there. It also can't increase the value of useless information. Its greatest strength lies not in being clever, but in being consistent and automated, and in reducing the risk, turnaround time and human effort required to manage the ocean of data flowing around us.