State of Data 2025

The Tool Phase of AI Is Coming.

Industries Are Not Prepared.

Methodology: The Data Innovation Journey™

Hakkoda uses the Data Innovation Journey model to assess businesses across five key transformational capabilities and four stages of maturity:

Chaos, Order, Insight, and Innovation.

A Chaos Organization is still at the beginning of its data journey and has yet to identify and harness the strategies, tools, and services that will optimize its data and technology stack. In contrast, an Innovation Organization is well into its data journey and has moved past the fundamentals of centralizing and standardizing its data, focusing instead on initiatives that allow it to leverage advanced capabilities like AI automation and data monetization.

A survey of 500 data leaders from organizations with 1,000 or more employees across eight major industries classified participants across these four unique stages of data maturity to identify key trends among high and low performers. These trends were then analyzed and distilled into the State of Data report, which identifies focus areas, use cases, predictions, and opportunities for the coming year.

This report also draws upon Clay Christensen’s disruption theory, which posits that new innovation may initially seem less impactful, much like a toy. New innovation may stay in this “toy” phase for some time before progressing to a “tool,” at which point it has the potential to significantly disrupt even established market forces.

Does your organization still need to migrate data from on-prem or legacy systems to a cloud platform?

Data leaders agree that the future is in the cloud.

The cloud migration rush has reached a fever pitch, with 99% of organizations indicating they will have deployed a cloud data store or warehouse by the end of 2025 and only 1% of organizations holding out with no plans to deploy. 

Large organizations, meanwhile, are unanimous in their acknowledgement that a centralized cloud is the future. Every data leader surveyed from organizations with more than 5,000 employees said they planned to deploy a centralized cloud platform, a sign that businesses across industries understand the cloud is key to scalable operations and revenue.

That trend toward unanimity is especially strong among the most data-mature enterprises. While 93% of top performers indicated that they were already using a cloud platform as of 2023, this number jumped to 100% when respondents were asked if they would be using a cloud platform by 2024’s end.

This Iceberg, too, is bigger than it appears.

With a growing number of technology vendors, and new capabilities arriving from existing ones, the need for flexible tool selection has reached a crescendo. Data teams need to be able to choose the right tool for each job, and those tools are unlikely to all come from one vendor. Enter the open source table format, Apache Iceberg.

Iceberg has quickly gained traction as a high-performance, vendor-agnostic table format, offering a powerful solution to the growing need for flexible, interoperable data architectures. Major players in the data space, including Snowflake and Databricks, are increasingly recognizing the value of Iceberg, acknowledging it as the large enterprise lake format of the future.

In 2025, Iceberg will become the lynchpin in the “bring-your-own-tool” craze, facilitating the boom of cross-cloud and multi-cloud architectures—and even “multi-compute” data strategies that leverage the strengths of multiple platforms simultaneously. Ultimately, this shift will generate huge cost savings and greater pliancy for organizations dealing with large volumes of data while opening a pathway to new architectural possibilities.

As more organizations adopt Iceberg, their use cases and innovations will continue to demonstrate that the future of data management will not be defined by any one proprietary platform, but by the flexibility of open standards that allow organizations to select the best tools for their needs and adapt to technological changes.


Stragglers are running out of time to retire their legacy tech debts.

While others surge to new tools, 15% of organizations still indicate the intention to deploy technologies they deem “legacy” in 2025—a figure which illustrates how difficult businesses find it to move off of outmoded platforms altogether.

This continued reliance on obsolete technology is especially prevalent among the least data-mature organizations, where 26% project continuing to deploy tooling they already identify as legacy through this year and beyond. The costs of this inability to outgrow their existing data stacks will only continue to grow as innovators in their industries close ranks around modern, cloud-native technology stacks.

It’s not all doom and gloom for late-blooming enterprises, however. A significant number of these less data-mature organizations indicated 2025 would be their year to catch up by migrating to a primary cloud platform, monetizing their data, and seizing the AI use cases their competitors deployed in 2024.

37% of “chaos” organizations reported plans to centralize on a primary cloud platform in 2025, while 47% of these same organizations reported the intention to monetize their data this year—as opposed to the 9% of “innovation” organizations who anticipated still needing to put their monetization strategies in place in 2025.

What are your organization’s plans for deploying legacy technologies in your data stack?

Organizations still needing to centralize data on a primary cloud platform in 2025 or later:

Plan to migrate to a central cloud platform in 2025 or later

Organizations still needing to monetize their data in 2025 or later

VPs and General Managers, in particular, see 2025 as the year to migrate to the cloud. This matters, because C-suite executives seem suspiciously bullish about their existing tech stacks.

Significant maturity differences also arise between industries. Among entertainment organizations, 41% indicated they plan to centralize data on a primary cloud platform in 2025 or later, making them the laggards across all industries. In contrast, only 20% of supply chain and logistics organizations planned on centralizing in 2025 or later, placing them at the front of the pack among cloud adopters.

Enterprises have been using AI as a toy…that’s about to change.

To date, enterprises have struggled to show the ROI of their AI initiatives. Teams either a) don’t measure it or b) don’t measure it very well. The reasons behind this are anything but surprising. Organizations are building multiple new muscles, using new tech, and learning how to forecast AI usage, and frankly, they have no roadmap stretching either forward or backward for how to benchmark, measure, and validate their ROI. We haven’t exited this phase of our collective AI maturity, and until we do, AI remains a toy: something we’re testing and looking at with great excitement, but ultimately not a core piece of our businesses.

In 2025, that’s all about to change. The AI-as-a-tool phase is coming, and it will require highly specific, trained, and monitored AI solutions, including approaches like retrieval-augmented generation (RAG) and agentic AI, tailored to unique industries and even roles. In 2025, look for a swarm of industry-focused AI tools for sectors like law, healthcare clinical analytics, supply chain, manufacturing, and more.

Adoption rates show that we are poised for this shift. In 2024, 47% of organizations indicated they would use AI copilots, while only 15% of data leaders indicated an intent to hold off adoption until 2025. LLM usage followed a similar pattern, with 42% of organizations utilizing at least one LLM across their operations in 2024 and a mere 12% waiting until 2025. For almost half of organizations across industries, AI adoption is already well underway, leaving data teams all the more prepared to start understanding their own baselines for value, use cases, and utilization.

Which technologies does your organization currently deploy or plan to deploy in the year ahead?

“I think of AI like an Ironman suit. That’s really the whole concept: humans doing human jobs, but doing them far more effectively and efficiently, and doing things that weren’t possible before AI.”

- Erik Duffield, CEO of Hakkoda

The transition to true utilization of AI as a tool won’t happen overnight, but within the year, a majority of organizations may well be poised to make their own leap, and the repercussions will be ground-shaking. Be on the lookout for sudden industry shifts. Bold new competitors may emerge seemingly out of nowhere. Known entities may change their entire business model. And for all of us, the baseline for competition will rise. If you’re among those organizations that have elected to wait until 2025 to start integrating AI, time is already running out, and you’ll need to develop a plan to accelerate the jump to the AI tool phase.

As AI integrations pivot toward ROI, new urgency is mounting around specific use cases.

1. Governance and Compliance

The rapid integration of AI into tightly regulated fields like financial services and healthcare is already outstripping the rate at which governing bodies can regulate its use. As organizations get more serious about measuring the impact of their deployments, they must also be more vigilant than ever about evolving legal requirements.

Perhaps unsurprisingly, top performing enterprises are taking a proactive approach to this evolving regulatory landscape, leveraging—you guessed it—generative AI capabilities to streamline their data governance and compliance processes. 

In fact, 96% of innovation-stage organizations indicate that they are already using AI for data governance and compliance as they move into 2025. Organizations that are not already leveraging the agility of AI tooling to keep on top of regulatory requirements, meanwhile, run the risk of falling behind as new privacy, security, and legal requirements emerge this year.

2. Data Cleaning and Preprocessing

Of course, as is well established by now, underlying data quality will remain a sort of litmus test for determining the value of AI deployments. As companies ramp up their AI pursuits this year, they will be forced to recognize that their data is the limiting factor.

Innovators across industries seem to anticipate this bottleneck, with only 6% of those enterprises indicating that they will still need to integrate AI into their data cleaning and preprocessing operations in 2025 and beyond. Compare that number with the 69% of Innovators who indicated they were already using AI for data cleaning and preprocessing at the beginning of 2024, and the 25% who planned to implement this use case during 2024.

Less data-mature organizations, meanwhile, continue to face delays in cleaning and preprocessing use cases, with only 28% reporting successful implementation at the beginning of 2024 and 22% still planning to close that gap in 2025.

3. Workflow Automation

The road to strong returns on investment is, of course, a branching one, with new revenue streams and additional value on one side and reduced costs and risk on the other.

For many organizations, especially in and around the supply chain and logistics space, that second route is beginning to emerge as the quickest way to start tying AI implementations directly to their bottom line. 

Predictably, the most data-mature organizations have already leaned in hard on using AI to automate cumbersome manual processes, with 94% of top performers reporting plans to have automated workflows in place by the end of 2024.

While 21% of the least data-mature organizations indicated they would still need to automate with AI in 2025 and beyond, 75% of these same companies plan to have those automations in place before then, signaling a strong trend in the direction of market-wide adoption.

“In 2025, the focus will be on automation, generative AI, and end-to-end visibility in supply chain analytics. Most enterprises in this space operate on thin margins, so automation enables companies to do more with less. Rather than focusing solely on generating revenue, the emphasis will be on reducing costs and improving operational efficiency.”

- Rico Mawcinitt, Global Head of Supply Chain and Logistics at Hakkoda

4. Real-Time Fraud Detection

Regulatory bodies aren’t the only external forces that businesses will need to keep up with if they want to remain competitive in 2025. Increasingly sophisticated and ubiquitous cybersecurity threats will require enterprises to develop equally advanced fraud detection and security solutions, especially for companies trading in vast amounts of sensitive information.

To that end, real-time, AI-driven fraud detection is quickly becoming not just an impressive capability touted by industry pioneers, but a competitive necessity that entire sectors will need to figure out fast if they are to navigate an unprecedented influx of fraudulent activities and other risks. 

AI’s unique capacity to analyze vast amounts of transactional data in a matter of milliseconds, coupled with the ability to quickly and accurately identify anomalies all but invisible to human agents, is positioned to fundamentally transform the cybersecurity space in the coming year. Enterprises that fail to strategically integrate this technology will risk regulatory penalties, costly data breaches, and worse.
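To make the idea of flagging anomalies in transactional data concrete, here is a minimal sketch in Python. It is not drawn from the survey or from any vendor's product. It assumes a hypothetical list of transaction amounts and uses a simple robust statistic (the modified z-score, based on the median absolute deviation) with the commonly used cutoff of 3.5. The function name, sample data, and threshold are illustrative assumptions. Production fraud systems rely on far richer features and trained models.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    Hypothetical helper for illustration only. The modified z-score uses the
    median and the median absolute deviation (MAD), so a single extreme value
    does not distort the baseline the way a mean and standard deviation would.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half of the values equal the median, so the score is undefined.
        return []
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical transaction amounts: nine routine purchases and one outlier.
transactions = [42.0, 39.5, 41.2, 40.8, 43.1, 38.9, 40.0, 975.0, 41.7, 39.9]
print(flag_anomalies(transactions))  # prints [7]
```

In this toy data only the 975.0 transaction at index 7 is flagged. A real-time system would apply a comparable scoring step to each incoming transaction against a continuously updated baseline, rather than to a fixed list.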

Organizations that say applying AI to automate decision making is a major challenge for data management and operations

“Future-proof” architectures are a thing of the past.

AI is shifting the technology landscape faster than data teams can build. That’s not a death knell for the in-house IT team, but it is a mandate to evolve how we build, plan, and roadmap. The five-year plan for your organization’s tech stack is dead. Let it be. The future is more exciting, if admittedly unpredictable. Data teams will need to shift to 6-to-18-month strategy roadmaps, with frequent checkpoints along the way to assess whether some new technology, capability, competitor, or revenue opportunity necessitates a pivot.

While the idea of constantly writing and rewriting your organization’s data strategy may seem daunting, the good news is that this accelerated landscape will also offer your business opportunities to leapfrog modernization. If AI is the 4th wave, look for places where you can jump ahead. Have infrastructure that’s not completely modernized? You may not have to follow the grueling process of yesteryear, diligently moving your business from wave 2 to wave 3. AI will offer opportunities to shortcut and accelerate your organization’s modernization process. Smart data teams will look for these moments and seize them.

Though modernization is a priority for most, it may not look like a total reimagining of companies’ tooling. Instead, modernization in 2025 will focus on optimizing and centralizing critical technologies that are already in place, but not yet being utilized to their full potential.

Conclusion: Keys to leading a successful data team & business in 2025.

The best data and AI experts are already advocating for a few critical best practices across industries. Direct from senior leaders, here’s the advice you need to know. 

However you start with your AI adoption, make sure it’s a domain-aligned model. Bringing your AI tools as close as possible to the business will allow you to pivot quickly, realize value faster, and ultimately, prepare for a self-service future. All eyes are on agentic AI.

Look for AI ROI in the mundane. It’s the natural tendency of human beings to want to take things a bridge too far, especially if your organization is still in the “toy” phase of AI. But the truth of finding early success with your AI initiatives is that it’s the dull, repetitive, often disliked tasks that are easiest to automate. Have 25 people at your organization whose sole job is scraping data from specific websites? That’s the place to implement AI. Your millions hidden under the couch cushions are there, but you won’t find them if you’re focused on adopting technology that’s not mature or that your team can’t fully support. One returns to that old adage so often used to caution the tech industry: Early and wrong look the same.

Adopt a RaaS mindset. Results as a Service (RaaS) isn’t just an emerging business model, it’s also a way of thinking about your own data strategy. RaaS focuses on delivering specific outcomes through automated processes and AI agents. Why does this matter? Well, the modern data team may not be a cost center. Rather than a necessary expenditure that the business absorbs, data teams are poised to become profit centers in many industries. Even if that feels a long way off for your organization, adopting a RaaS mindset can start to prepare your department to service and deliver impact across the business.

Consolidate your most valuable talent. Design thinking and automation expertise are becoming highly sought-after skill sets. And of course, AI represents immense opportunity, but there’s nothing worse than automating a process that shouldn’t exist in the first place. People who can rapidly understand your current state, articulate the art of the possible with new tooling, and connect those opportunities back to pragmatic business outcomes…those people are the new 10x engineers. If you have team members with that talent, hold on to them. If you don’t, it’s time to start looking.

However you choose to lead, grow, and strategize, one thing is clear: 2025 is set to be another year of transformation for our industry.
