In 2017, The Economist described data as the oil of the 21st century. It’s a great metaphor because both create value in the right conditions. Oil adds value when it is extracted, processed, and transported to a customer. Data is similar. Data sitting quietly in a database is nothing but potential value. The way companies move and extract data is not working well, however. Perhaps a more accurate analogy is data as the new plutonium – capable of immense power when properly harnessed, but toxic when misused or in the wrong hands.
Sluggish and Expensive: The State of Modern Data
Traditional data methods like ETL (Extract, Transform, and Load) worked well in the low-volume world of the past. Twenty or thirty years ago, the volume of new data created every day was a fraction of today's. In 2010, two zettabytes of data were created and consumed worldwide; by 2020 that figure had risen to 64 zettabytes, according to Statista. The sheer volume of data is only part of the story: the growing appetite of data consumers drives it as well.
Consumers and business users have an increasing hunger for data. Eighty percent of consumers want personalization from retailers, according to a survey reported by McKinsey & Co. Delivering on that personalization also translates to increased revenue. McKinsey found that grocery companies that personalize at scale can increase total revenue by 1-2% while reducing sales and marketing costs by 10% or more – a huge impact in a low-margin, high-volume business. The gains in higher-margin industries may be even larger.
Delivering on that expectation will require the ability to access data quickly and cheaply. Gartner points out that data analytics is fast becoming a core business function. Further, high-quality data is required to power increasingly sophisticated machine learning applications. It will be nearly impossible to achieve your data automation goals when there are bottlenecks in your data processes and in how data is cleaned and prepared.
The way we're managing data today used to make sense, but the world has changed. To stay competitive and gain an edge, you need to boost data automation. A few years ago, launching data automation was difficult: organizations had to take on the cost, ownership, and administration of the function, recruit a team of expensive, hard-to-find data scientists, and then invest in custom software development to make progress. Those formidable barriers to entry are now starting to fade thanks to new tools.
The New Data Automation Strategy
The new way to look at data requires a new mindset. In the past, much data strategy was focused on databases with logical order. The structured nature of databases made the first wave of data automation possible. Without databases, creating complex financial reports would require vast armies of DBAs and administrative personnel.
At this stage, most companies have nearly exhausted the potential of traditional databases and business intelligence tools. The next data opportunity will require new tools and a different mindset. So what are the three key ingredients of this modernization?
Ingredient 1: Build A Data Pipeline
What Is A Data Pipeline?
A data pipeline is a series of automated actions that gathers data from multiple sources and delivers it to the user at the specific point and time where it adds value.
Unlike a traditional database, a data pipeline emphasizes constant movement and multiple sources. A data pipeline may combine existing data from your organization and leverage outside data sources. For example, a data pipeline at a property insurance company might combine an internal actuarial database with external weather pattern data to refine insurance pricing.
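The insurance example above can be sketched in a few lines of code. This is a minimal, hypothetical illustration of one pipeline stage, not a production implementation: the field names (`region`, `base_premium`) and the risk multipliers are invented sample data standing in for an internal actuarial database and an external weather-pattern feed.

```python
# Hypothetical internal actuarial records (invented sample data).
internal_policies = [
    {"policy_id": "P-100", "region": "gulf_coast", "base_premium": 1200.0},
    {"policy_id": "P-101", "region": "midwest", "base_premium": 900.0},
]

# Hypothetical external weather-pattern data: risk multipliers by region.
external_weather_risk = {"gulf_coast": 1.35, "midwest": 1.05}

def run_pipeline(policies, weather_risk):
    """One pipeline stage: join internal policy records with external
    weather risk data and emit enriched records with refined pricing."""
    enriched = []
    for policy in policies:
        # Fall back to a neutral multiplier when no weather data exists.
        multiplier = weather_risk.get(policy["region"], 1.0)
        enriched.append(
            {**policy, "adjusted_premium": round(policy["base_premium"] * multiplier, 2)}
        )
    return enriched
```

A real pipeline would run stages like this continuously and automatically as new data arrives, rather than as a one-off batch job.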
A data pipeline is a way to give analysts and customers access to real-time insight rather than waiting for reports. Since 2020, we've seen an increasing number of supply chain delays and disruptions. Walmart and Home Depot are pursuing highly creative solutions to these challenges by chartering their own cargo ships, according to NBC News. This solution carries a significant price tag of as much as $40,000 per day for a vessel. Before incurring those expenses, companies would be better served by a data-driven cost-benefit analysis.
A data pipeline that connects inventory levels and sales data can play a role in guiding supply chain decisions. Once the pipelines are in place, you may face a new problem. A massive increase in data volume creates the opportunity to make thousands of decisions to optimize your business. That’s where machine learning operations add value to your data.
Ingredient 2: Machine Learning Operations (ML Ops)
Machine learning operations is the practice of deploying and running machine learning models with DevOps-style processes. It means bringing machine learning out of the lab and using it to manage parts of your business. A couple of examples:
- Machine Learning In Banking.
An Australian financial services company is now using machine learning to improve its fraud detection, building new models that catch significantly more fraudulent activity. Better detection means lower fraud losses and reduced reputational risk.
- Applying Machine Learning In Regulated Industries.
Developing new technology and treatments in the health field is challenging. Legacy technology holds back innovation, and meeting regulatory compliance requirements is expensive. Oravizio, a medical device software product, is using machine learning operations to enhance its product while meeting strict European Union regulations.
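The core idea behind ML Ops, moving a model out of the lab and into controlled production use, can be sketched very simply. The registry below is a hypothetical illustration, not any particular vendor's API, and the toy fraud "model" is a stand-in for a real trained model with a predict method.

```python
class ModelRegistry:
    """Hypothetical sketch of an ML Ops model registry: versions are
    registered, then explicitly promoted to serve production traffic."""

    def __init__(self):
        self._models = {}
        self.production_version = None

    def register(self, version, model):
        # Store a candidate model without exposing it to users yet.
        self._models[version] = model

    def promote(self, version):
        # Flip production traffic to the chosen version.
        self.production_version = version

    def predict(self, features):
        # Route predictions through whichever version is in production.
        return self._models[self.production_version](features)

# A toy stand-in for a trained fraud model: flag large cross-border transactions.
def fraud_model_v2(txn):
    return txn["amount"] > 5000 and txn["country"] != txn["home_country"]

registry = ModelRegistry()
registry.register("v2", fraud_model_v2)
registry.promote("v2")
```

Real ML Ops platforms add monitoring, automated retraining, and rollback on top of this register-then-promote pattern, but the separation between "registered" and "in production" is the essential step out of the lab.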
Ingredient 3: Move Data To Where It Creates Value
Trapping data behind walls and hard-to-reach places puts a hard limit on the value you can create from your data. The solution is to emphasize moving data to where it will create the most value.
As a starting point, look at the needs of internal users and external users. For internal users, it is crucial to look beyond the usual suspects of data consumers (e.g., BI experts and analysts). Using a no-code app like Unqork is an excellent way to empower less technical employees with data. Next, move data to external users, including customers and consultants. To create the most value, aim to deliver real-time or near real-time data to end users.
The Path To Data Automation Starts Here
Your first step toward data automation is choosing the right data platform, like Snowflake. The second step is to get support, consulting, and training to achieve data automation quickly. Contact Hakkoda today to discuss leveraging our nearshore team and Snowflake concierge capabilities to bring these ingredients to your business.
More from Hakkoda:
Download our eBook: THE STATE OF DATA AND THE HIGH COSTS OF DATA SPRAWL
Read other blog posts: