In 2017, The Economist described data as the oil of the 21st century. It’s an apt metaphor because both create value only in the right conditions. Oil adds value when extracted, processed, and transported to a customer. Data is similar: sitting quietly in a database, it is nothing but potential value. The way companies move and extract data is not working well, however. Perhaps a more accurate analogy is data as the new plutonium – capable of immense power when properly harnessed but toxic when misused or in the wrong hands.
Sluggish and Expensive: The State of Modern Data
Traditional data methods like ETL (Extract, Transform, and Load) worked well in the low-data-volume world of the past. Twenty or thirty years ago, the volume of new data created every day was a fraction of today’s. In 2010, two zettabytes of data were created and used worldwide, but that figure rose to 64 zettabytes by 2020, according to Statista. The sheer volume of data is just part of the story – growing appetites among data consumers are also driving the change.
Consumers and business users have an increasing hunger for data. Eighty percent of consumers want personalization from retailers, according to a survey reported by McKinsey & Co. Delivering on that personalization also translates into increased revenue. McKinsey found that grocery companies that personalize at scale can increase total revenue by 1-2% while reducing sales and marketing costs by 10% or more – a huge impact in a low-margin, high-volume business. The gains in higher-margin industries may be even greater.
Analytics is a Core Business Function
Delivering on that expectation will require the capability to access data quickly and cheaply. Gartner points out that data analytics is fast becoming a core business function. Further, high-quality data is required to power increasingly sophisticated machine learning applications. It will be nearly impossible to achieve your data automation goals while bottlenecks remain in your data processes and in the way data is cleaned and prepared.
The way we’re managing data today used to make sense. The world has changed. To stay competitive and gain an edge, you need to boost data automation. A few years ago, launching data automation was difficult. Organizations had to take on the function’s cost, ownership, and administration. You had to recruit a data team of expensive, hard-to-find data scientists. Once the team was in place, custom software development was needed to make progress. Thanks to new tools, those formidable barriers to entry are starting to fade away.
The New Data Automation Strategy
The new way to look at data requires a new mindset. In the past, much data strategy was focused on databases with logical order. The structured nature of databases made the first wave of data automation possible. Without databases, creating complex financial reports would require vast armies of DBAs and administrative personnel.
At this stage, most companies have nearly exhausted the potential of traditional databases and business intelligence tools. The next data opportunity will require new tools and a different mindset. So what are the three key ingredients of modernization?
Ingredient 1: Build A Data Pipeline
What Is A Data Pipeline?
A data pipeline is a series of automated actions that gather data from multiple sources and deliver it to the user at the specific time and place where it adds value.
Unlike a traditional database, a data pipeline emphasizes constant movement and multiple sources. A data pipeline may combine existing data from your organization and leverage outside data sources. For example, a data pipeline at a property insurance company might combine an internal actuarial database with external weather pattern data to refine insurance pricing.
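The insurance example above can be sketched in a few lines. This is purely illustrative: the data sources, field names, and risk factors are all hypothetical stand-ins for an internal actuarial database and an external weather feed.

```python
# Illustrative pipeline step: enrich internal policy records with
# external weather-risk data before pricing. All sources, fields,
# and factors here are hypothetical.

internal_policies = [
    {"policy_id": "P-100", "zip": "33101", "base_premium": 1200.0},
    {"policy_id": "P-101", "zip": "73301", "base_premium": 950.0},
]

# Stand-in for an external weather-pattern feed, keyed by ZIP code.
external_weather_risk = {"33101": 1.35, "73301": 1.05}

def enrich_and_price(policies, weather_risk, default_factor=1.0):
    """Join each policy with its regional weather-risk factor
    and compute an adjusted premium."""
    enriched = []
    for p in policies:
        factor = weather_risk.get(p["zip"], default_factor)
        enriched.append({
            **p,
            "weather_factor": factor,
            "adjusted_premium": round(p["base_premium"] * factor, 2),
        })
    return enriched

priced = enrich_and_price(internal_policies, external_weather_risk)
```

In a production pipeline, each of these steps (extract, join, deliver) would run automatically on a schedule or on arrival of new data, rather than as a one-off script.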
A data pipeline is a way to give analysts and customers access to real-time insight rather than waiting for reports. Since 2020, we’ve seen many supply chain delays and disruptions. According to NBC News, Walmart and Home Depot pursued highly creative solutions to these challenges by chartering their own cargo ships. However, this solution carries a significant price tag of as much as $40,000 per day for a vessel. Before incurring those expenses, companies are better served by a data-driven cost-benefit analysis.
A data pipeline that connects inventory levels and sales data can play a role in guiding supply chain decisions. However, once the pipelines are in place, you may face a new problem. A massive increase in data volume creates the opportunity to make thousands of decisions to optimize your business. That’s where machine learning operations add value to your data.
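A minimal sketch of that inventory-and-sales connection might compute how many days of demand current stock covers and flag items at risk of running out. The SKUs, sales rates, and lead time below are hypothetical.

```python
# Illustrative only: flag SKUs whose inventory will not cover expected
# demand before a replenishment order arrives. All numbers are made up.

inventory = {"SKU-1": 500, "SKU-2": 80}          # units on hand
daily_sales_rate = {"SKU-1": 20.0, "SKU-2": 16.0}  # avg units sold per day

def days_of_cover(inv, rate):
    """Days of demand the current stock can satisfy, per SKU."""
    return {sku: inv[sku] / rate[sku] for sku in inv if rate.get(sku)}

def reorder_candidates(inv, rate, lead_time_days=14):
    """SKUs that will run out before a new order could arrive."""
    cover = days_of_cover(inv, rate)
    return sorted(sku for sku, days in cover.items() if days < lead_time_days)

flagged = reorder_candidates(inventory, daily_sales_rate)
```

With a pipeline feeding fresh inventory and sales data into a check like this, the decision to expedite (or not) can be revisited daily instead of quarterly.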
Ingredient 2: Machine Learning Operations (ML Ops)
Machine learning operations is the practice of deploying and maintaining machine learning models in production, typically by DevOps teams. It means bringing machine learning out of the lab and using it to manage parts of your business. A couple of examples:
- Machine Learning In Banking
An Australian financial services company is now using machine learning to improve its fraud detection, building new models that significantly raised detection rates. Better fraud detection means lower fraud losses and reduced reputational risk.
- Applying Machine Learning In Regulated Industries
Developing new technology and treatments in the health field is challenging. Legacy technology holds back innovation, and meeting regulatory compliance requirements is expensive. For example, Oravizio, a medical device software product, is using machine learning operations to enhance its product while meeting strict European Union regulations.
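The fraud-detection example above can be sketched as a simple scoring rule. Real production models are trained on many labeled features; this z-score outlier check (with invented transaction amounts) only shows the "score, then act" shape that ML Ops puts into production.

```python
import statistics

# Purely illustrative fraud screen: flag transactions whose amount is a
# statistical outlier relative to a customer's history. The history and
# threshold below are hypothetical.

history = [42.0, 55.0, 38.0, 61.0, 47.0, 52.0]  # past transaction amounts

def fraud_score(amount, past_amounts):
    """Standard deviations between this amount and the historical mean."""
    mean = statistics.mean(past_amounts)
    stdev = statistics.stdev(past_amounts)
    return abs(amount - mean) / stdev

def needs_review(amount, past_amounts, threshold=3.0):
    """Route extreme outliers to a human or a downstream model."""
    return fraud_score(amount, past_amounts) > threshold

flag_small = needs_review(49.0, history)   # typical amount
flag_large = needs_review(900.0, history)  # extreme outlier
```

The value of ML Ops is in what surrounds a rule or model like this: automated deployment, monitoring, and retraining as fraud patterns drift.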
Ingredient 3: Move Data To Where It Creates Value
Trapping data behind walls and in hard-to-reach places puts a hard limit on the value you can create from it. The solution is to emphasize moving data to where it will create the most value.
As a starting point, look at internal and external users’ needs. For internal users, it is crucial to look beyond the usual suspects of data users (e.g., BI experts and analysts). For example, using a no-code app like Unqork is an excellent way to empower less technical employees with data. Next, move data to external users, including customers and consultants. Finally, to create the most value, aim to deliver real-time or near-real-time data to end users.
The Path To Data Automation Starts Here
Your first step in data automation is choosing the right data platform like Snowflake. The second step is to get support, consulting, and training to achieve data automation quickly. Contact Hakkoda today to discuss leveraging our nearshore team and Snowflake concierge capabilities to bring these ingredients to your business.