AI promises speed, intelligence, and competitive advantage. Yet for many organizations, the path to AI adoption is blocked by a problem that rarely makes the headlines: data debt.
As enterprises rush to operationalize AI, they are discovering that years of unmanaged growth, duplicated datasets, unclear ownership, and poor data quality have quietly accumulated into a tax on progress. The result is stalled AI initiatives, rising cloud costs, and growing risk.
In our latest webinar, Getting Data Fit: From Data Debt to AI-Ready, Steven Catanzano, Senior Analyst at Omdia, joins Michael Donahue, Global Field CTO at Pentaho, to unpack why data fitness has become a prerequisite for AI success and how organizations can take practical steps to get there.
This 19-minute session shares a clear and powerful message for any data leader: AI cannot outrun the condition of the data beneath it.
While AI investment accelerates, so do data challenges. According to Omdia research shared in the webinar, 61 percent of organizations report that AI initiatives are actively creating new big data problems that must be addressed.
This is the AI paradox. Models and analytics continue to grow more sophisticated, while the underlying data environment becomes harder to manage. Every day, there are more pipelines, more storage, more copies, and more uncertainty around what data should be trusted or retained.
Without intervention, AI only makes the problem bigger and creates more risk.
A core contributor to data debt is redundant, obsolete, and trivial (ROT) data. ROT quietly consumes infrastructure, complicates governance, and dilutes data quality signals.
Omdia estimates that between 60 and 85 percent of enterprise data is ROT. This data incurs heavy storage, backup, security, and compliance costs while increasing the surface area for risk.
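To make ROT concrete, here is a minimal scan sketch in Python: it flags redundant files by content hash, obsolete files by last-modified age, and trivial files by size. This is an illustration only, not Pentaho’s implementation; the root path and thresholds are assumptions.

```python
"""Minimal ROT-scan sketch (illustrative, not Pentaho's implementation)."""
import hashlib
import time
from pathlib import Path

STALE_DAYS = 730       # assumption: untouched for 2+ years -> obsolete
TRIVIAL_BYTES = 1024   # assumption: under 1 KB -> trivial

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def rot_scan(root: str) -> dict[str, list[Path]]:
    seen: dict[str, Path] = {}
    report: dict[str, list[Path]] = {"redundant": [], "obsolete": [], "trivial": []}
    now = time.time()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        st = path.stat()
        if st.st_size < TRIVIAL_BYTES:
            report["trivial"].append(path)
        if (now - st.st_mtime) > STALE_DAYS * 86400:
            report["obsolete"].append(path)
        digest = sha256(path)
        if digest in seen:
            report["redundant"].append(path)  # same bytes as an earlier file
        else:
            seen[digest] = path
    return report

if __name__ == "__main__":
    for category, files in rot_scan("/data").items():  # assumed path
        print(f"{category}: {len(files)} files")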
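```

A real ROT analysis would also weigh lineage, ownership, and access patterns, but even a crude scan like this surfaces how much of an estate is duplicated or untouched.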
As Michael Donahue explains in the session, it’s not just about visibility. It’s also about decisions: the inability to reliably answer which data has value, who owns it, or where it should live.
Until those questions are addressed, AI initiatives inherit uncertainty by default.
Being data fit does not mean keeping everything or deleting aggressively. It means understanding your data well enough to make confident, repeatable decisions about its lifecycle.
In the webinar, Steven and Michael outline Pentaho’s five-step data fitness framework, which provides a practical model for moving from data chaos to AI-ready assets.
This framework emphasizes progress over perfection, providing structure without slowing the business down.
Data fitness is often framed as a governance or compliance effort. The webinar makes clear that it is also a financial and operational opportunity.
One case study highlighted in the session shows a global organization achieving a 70 percent reduction in storage costs within just 10 weeks following an initial ROT analysis. They expect £1.68 million in annual savings, unlocked by identifying low-value data and aligning it to the right storage tiers or retirement paths.
Just as importantly, the organization gained clarity and confidence in the data it is choosing to retain. That confidence is what makes AI initiatives scalable instead of fragile.
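To see why tiering alone can cut costs that sharply, consider a back-of-the-envelope calculation. The prices and volumes below are illustrative assumptions, not figures from the case study:

```python
# Illustrative tiering arithmetic; prices and volumes are assumptions,
# not figures from the webinar case study.
HOT_PER_GB_MONTH = 0.023       # e.g., a standard object-storage tier
ARCHIVE_PER_GB_MONTH = 0.001   # e.g., a deep-archive tier

total_gb = 2_000_000           # assumed 2 PB estate
rot_fraction = 0.70            # assumed share identified as ROT

before = total_gb * HOT_PER_GB_MONTH
after = (total_gb * (1 - rot_fraction) * HOT_PER_GB_MONTH
         + total_gb * rot_fraction * ARCHIVE_PER_GB_MONTH)

print(f"monthly spend before: ${before:,.0f}")
print(f"monthly spend after:  ${after:,.0f}")
print(f"reduction: {1 - after / before:.0%}")
```

Under these assumed numbers, moving roughly 70 percent of the estate off the hot tier cuts monthly spend by about two-thirds before any data is deleted at all.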
Pentaho’s approach brings together data optimization, data cataloging, and data quality to create a continuous feedback loop. Usage informs value. Quality informs trust. Policy informs action. Monitoring keeps everything aligned as data changes.
This integrated model reduces manual effort while improving accountability and transparency across teams.
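One way to picture that feedback loop is as a policy function that maps usage and quality signals to a lifecycle action, re-evaluated as the signals change. The schema and thresholds below are a hypothetical sketch, not Pentaho’s API:

```python
from dataclasses import dataclass

@dataclass
class DatasetSignals:
    """Hypothetical signals a catalog might track per dataset."""
    days_since_last_read: int
    quality_score: float       # 0.0-1.0, from profiling/validation
    has_owner: bool
    under_legal_hold: bool

def lifecycle_action(s: DatasetSignals) -> str:
    """Map signals to an action; thresholds are illustrative policy."""
    if s.under_legal_hold:
        return "retain"                      # policy overrides usage
    if not s.has_owner:
        return "flag-for-ownership-review"   # can't decide without an owner
    if s.days_since_last_read > 365 and s.quality_score < 0.5:
        return "retire"
    if s.days_since_last_read > 180:
        return "tier-to-cold-storage"
    return "retain"

# Usage: re-run on a schedule so decisions track changing data.
print(lifecycle_action(DatasetSignals(400, 0.3, True, False)))  # retire
```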
If you want to go deeper, the full ebook Getting Data Fit for AI: How Pentaho Intelligent Data Optimization Cuts Costs and Accelerates AI Readiness expands on these concepts. It explores the data fitness framework in detail, outlines real-world optimization strategies, and shows how organizations are reducing data spend while preparing for AI at scale.