In the coming year, leaders will focus on a few key fundamentals to maximize the value and use of their data for AI.
The public release of ChatGPT in late 2022 kicked off a GenAI acceleration unlike anything the data market has seen. Consider this: there have been over 30 significant updates to general-purpose LLMs in 2025 alone!
This year, however, brought a more fundamental and lasting mindset shift around data management – specifically, data foundations. As organizations looked to move AI from experimentation to operations, they ran headlong into core data challenges that have been festering for years. Quality, governance, and accessibility issues, especially with unstructured data, have kept leaders from deploying AI at scale.
This has driven a real and sustained interest in data fitness, the disciplined ability to discover, structure, govern, and mobilize data across hybrid environments. Enterprises that treat data as a strategic asset (with unified catalogs, lineage, and policy) are unlocking reliable AI outcomes faster; those that don’t see stalled pilots, spiraling storage costs, and automation built on sand.
In thinking about the past year, I see a few key trends that leaders need to consider if they’re going to realize their AI goals in 2026.
Shift From “AI First” To “Data Intelligent First”
AI does not fix data problems – it magnifies them. This year exposed the gap between AI tooling and the reality of siloed, opaque enterprise data. Hybrid architectures (cloud + on-prem), unstructured repositories (email, PDFs, media), and fragmented governance create blind spots that skew insights and decisions. The remedy is not another model; it’s trustworthy data design: consistent schemas, metadata and context, stewardship, and policies that span environments.
Eliminate ROT And Build Data Ecosystems For Agentic Workflows
The industry’s “store everything” habit has turned into a budgetary crisis and a strategic risk. Redundant, obsolete, and trivial data (ROT) inflates storage bills while degrading signals for AI systems. Organizations are responding by auditing what they have, classifying what they need, and automating lifecycle management so data stays purposeful. In parallel, as autonomous agents enter workflows, leaders are learning the hard way that agents require structured, traceable inputs with clear lineage and governance – or else speed simply multiplies mistakes.
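To make that audit-classify-automate loop a bit more concrete, here is a minimal, purely illustrative sketch that groups files into lifecycle tiers by last-access age. The thresholds, tier names, and the “route to a steward” step are assumptions for the example, not Pentaho product logic; a real lifecycle policy would also weigh lineage, business value, and legal or regulatory holds.

```python
# Illustrative sketch only: classify files into lifecycle tiers by last-access age.
# Thresholds and tier names below are hypothetical assumptions, not any product's policy.
import time
from pathlib import Path

ACTIVE_DAYS = 90      # assumed: touched within 90 days -> keep on fast storage
ARCHIVE_DAYS = 730    # assumed: untouched for ~2 years -> candidate for a cheaper tier

def classify(path: Path, now: float | None = None) -> str:
    """Return a proposed lifecycle tier for one file based on last-access age."""
    if now is None:
        now = time.time()
    age_days = (now - path.stat().st_atime) / 86400
    if age_days <= ACTIVE_DAYS:
        return "active"               # keep hot
    if age_days <= ARCHIVE_DAYS:
        return "archive"              # move to cold/cheap storage
    return "review_for_deletion"      # likely ROT; route to a data steward for sign-off

def audit(root: str) -> dict[str, list[Path]]:
    """Walk a directory tree and group files by proposed tier."""
    tiers: dict[str, list[Path]] = {"active": [], "archive": [], "review_for_deletion": []}
    for p in Path(root).rglob("*"):
        if p.is_file():
            tiers[classify(p)].append(p)
    return tiers

if __name__ == "__main__":
    # Example: summarize how much of the current directory tree is potentially ROT.
    for tier, files in audit(".").items():
        print(f"{tier}: {len(files)} files")
```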
How Leaders Can Win In 2026 And Beyond
Across our work and writing this year, we stressed four commitments that leaders can make to carry AI forward with real impact.
Leaders who adopt these practices can move from AI trials to AI impact. They transform hybrid complexity into coherent, governed data flows that reliably power analytics, automation, and intelligent agents. If you’re interested in diving deeper into these topics, please check out these related columns in Forbes. Thanks for reading, and here’s to a more data-fit 2026!
Want to get data fit and AI ready? Explore proven ways to simplify data chaos and operationalize governance with our team.