Discover how strong data quality fundamentals drive AI and GenAI success by ensuring accuracy, completeness, and consistency through end-to-end data management.
Per the Oxford English Dictionary, quality is defined as “the standard of something as measured against other things of a similar kind; the degree of excellence of something.”
Data quality is measured both quantitatively and qualitatively. Together, the two kinds of measure provide real insight into the value of data. Quantitative measures, typically driven by statistical insights, are easier to compute, can be interpreted readily, and provide a level of clarity on the suitability of data.
Qualitative measures, when applied to data or information, are typically subjective and open to interpretation. I like to think of qualitative as ‘in context of’ or ‘in reference to’ when applied to data quality.
When breaking down data quality, the most common framework is quality dimensions. Quality dimensions mix quantitative and qualitative evaluation models that can be measured in isolation but are most useful and powerful when they are brought together. Consider completeness, uniqueness, and consistency as a starting point for quantitative dimensions.
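As a rough illustration of how the three quantitative dimensions can be scored, here is a minimal Python sketch. The function names, the sample records, and the exact definitions (for example, treating blank strings as missing, or measuring consistency as agreement between two sources) are assumptions made for this example, not a description of how any particular platform computes them.

```python
def _present(values):
    """Keep only values that are neither None nor blank."""
    return [v for v in values if v is not None and str(v).strip() != ""]


def completeness(values):
    """Share of values that are populated."""
    if not values:
        return 0.0
    return len(_present(values)) / len(values)


def uniqueness(values):
    """Share of populated values that are distinct."""
    present = _present(values)
    if not present:
        return 0.0
    return len(set(present)) / len(present)


def consistency(source_a, source_b, field):
    """Share of record ids found in both sources whose `field` agrees."""
    shared = source_a.keys() & source_b.keys()
    if not shared:
        return 0.0
    matches = sum(
        1 for key in shared
        if source_a[key].get(field) == source_b[key].get(field)
    )
    return matches / len(shared)


# Hypothetical sample data: a phone column and two systems holding it.
phones = ["2025550143", None, "2025550143", "", "4155550199"]
crm = {1: {"phone": "2025550143"}, 2: {"phone": "4155550199"}}
billing = {1: {"phone": "2025550143"}, 2: {"phone": "4155550100"}}

print(completeness(phones))               # 3 of 5 values are populated
print(uniqueness(phones))                 # 2 distinct among 3 populated
print(consistency(crm, billing, "phone")) # 1 of 2 shared ids agree
```

Each score is computed purely from the data itself, which is what makes these dimensions easy to automate.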
All of these lack external references, so by themselves they do not tell you whether data is appropriate for a given use. This is where additional qualitative insights are needed, including accuracy, timeliness, and correctness (or validity). Timeliness provides details on data’s age. Correctness ensures that, for instance, a phone number provided for an individual in the US is indeed a valid US phone number with 10 digits. Continuing with this example, accuracy determines whether the phone number given for an individual is their actual phone number. These are crucial elements that inform the design and application of the policies that feed data quality scores.
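The difference between correctness and accuracy in the phone number example can be sketched in Python. This is a simplified illustration under stated assumptions: the validity rule is only the 10-digit check described above (with an optional leading country code of 1), and `verified_phones` is a hypothetical stand-in for whatever trusted external reference an organization actually has.

```python
import re


def normalize_us_phone(raw):
    """Return the 10-digit form of a US phone number, or None if invalid.

    Assumption: "valid" means exactly 10 digits once punctuation and an
    optional leading country code 1 are removed. Real numbering-plan
    rules are stricter than this.
    """
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else None


def is_accurate(customer_id, raw, verified_phones):
    """Check a number against a trusted reference for that customer.

    `verified_phones` is hypothetical: a mapping of customer id to a
    number confirmed by some external source.
    """
    normalized = normalize_us_phone(raw)
    return normalized is not None and verified_phones.get(customer_id) == normalized


verified_phones = {"c-001": "2025550143"}

print(normalize_us_phone("(202) 555-0143"))                  # valid format
print(normalize_us_phone("555-0143"))                        # too short
print(is_accurate("c-001", "+1 202-555-0143", verified_phones))
print(is_accurate("c-001", "415-555-0199", verified_phones))
```

The last call shows the point of the example: a number can pass the validity check and still be the wrong number for that person, and only an external reference can reveal that.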
It quickly becomes clear that without context, data quality efforts will fall far short of what organizations need, not only for core operations but also for AI and GenAI. In many cases this context lives in unstructured data, which is crucial for AI and GenAI and which most organizations struggle to organize, classify, analyze, and activate.
The potential gaps in this one small example are writ large when you consider a mid-sized or large enterprise with hundreds of thousands of customer records. This is why hospitals, banks, and commercial enterprises of any size struggle with data quality when they do not use an end-to-end approach that leverages automation to apply policies, lineage, traceability, and quality across their data estates.
Pentaho considers and accounts for all of the above in our platform. It’s why we’re so focused on the relationships between data, the importance of accurately classifying data at the source, and the importance of carrying metadata properties throughout the lifespan of data.
In the next blog post, we’ll explore how these fundamentals shape what teams must consider to build a strong and scalable data quality strategy, how data quality is shifting in an AI world, and what data quality means when getting ‘data fit’ for that world.