Beyond big: Solving for data variety

Big Data is now mainstream at large companies, says a recent study, but many people still dislike the term.

Add me to the list. Size is only one part of the Big Data challenge, and the word “big” is preventing organizations from solving for the whole equation—volume, velocity and variety. IT can help the business get the most value from Big Data initiatives by thinking beyond volume and velocity and helping solve for data variety.

Everybody understands volume. Most people also understand velocity—the constant stream of data coming from the web, sensors, mobile devices, and social media. So, the conversation tends to be about “speeds and feeds”—how much data we can analyze and how fast? More and faster is great, but it doesn’t solve the entire problem.

Variety, the proliferation of data from many sources, internal and external, public and private, and in many different formats, gets lost in this discussion. It’s harder to understand and handle, and up until now there haven’t really been good technology solutions for it. However, when I talk to business users, variety is where they want to focus. In a study by ClearStory Data, 74% of businesses said they would ideally like to harmonize data from more than four disparate data sources for analysis.

These data sets can be small by Big Data standards, as small as an Excel spreadsheet, or they can be very large data sets in flat files. Usually the spreadsheets hold data from an internal system of record and have been massaged and maintained to meet some unaddressed need. The business views this as a trusted data source, because they know exactly where it came from and what’s been done to it.

A manufacturing company might get data from retail partners, but each data set looks different. The manufacturer calls a part ABC, one retailer calls it DEF and another calls it XYZ. They might also get supply chain and distribution partner data to add to the mix. Someone in the business spends a lot of time on data munging to try to bring all these data sets together and make sense of them, which is neither scalable nor strategic.
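To make the part-naming problem concrete, here is a minimal sketch of what harmonizing those feeds could look like once the mapping is written down instead of living in someone's head. Everything in it is an assumption for illustration: the source names, the `PART_CROSSWALK` table, the `units` field, and the numbers are hypothetical, not taken from any real system.

```python
# Hypothetical crosswalk: (data source, that source's part code) -> manufacturer's code.
PART_CROSSWALK = {
    ("retailer_1", "DEF"): "ABC",
    ("retailer_2", "XYZ"): "ABC",
}


def harmonize(source, rows):
    """Return new rows whose 'part' field uses the manufacturer's code."""
    harmonized = []
    for row in rows:
        # Codes with no crosswalk entry pass through unchanged.
        canonical = PART_CROSSWALK.get((source, row["part"]), row["part"])
        harmonized.append({"source": source, "part": canonical, "units": row["units"]})
    return harmonized


def total_units_by_part(feeds):
    """Sum units per canonical part code across every feed."""
    totals = {}
    for source, rows in feeds.items():
        for row in harmonize(source, rows):
            totals[row["part"]] = totals.get(row["part"], 0) + row["units"]
    return totals


# Made-up sample feeds: the same part under three different names.
feeds = {
    "manufacturer": [{"part": "ABC", "units": 100}],
    "retailer_1": [{"part": "DEF", "units": 40}],
    "retailer_2": [{"part": "XYZ", "units": 25}],
}
totals = total_units_by_part(feeds)
print(totals)
```

The lookup itself is trivial. The hard part in practice is building and maintaining the crosswalk as sources multiply, which is exactly the work that does not scale when one person does it by hand.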

For example, a few weeks ago, I met with a sales representative at a large media company who sells the company’s most profitable solution. She spends a whole day every week harmonizing four data sources to extract the intelligence she needs to be effective. That means she can only spend four days selling. It also means that by the end of the week, she’s working with out-of-date intelligence.

These are the kinds of problems the business really needs help with. To be the hero, IT needs two things: analytics solutions that can harmonize disparate data and process it in real time, and a new mindset. You can now buy the solution, once you’ve acquired the mindset.

Addressing data variety requires a different mindset because no one can say for certain what the business value will be. Variety is the “why” of the analysis. It’s not necessarily something that can be measured and optimized the same way more and faster can.

This is why a lot of Big Data initiatives are still in search of a use case. You can have petabytes of data coming in and processing in near-real time, but the moment you want to explore the why, you have to have variety. You have to be able to see outside the four walls of your business. What’s going on with your retailers and supply chain? Is weather impacting sales? Have the demographics of a territory or territories shifted? Is a competitor running a new marketing campaign? If you’ve ever looked at outliers in your data for which you couldn’t find an internal explanation, then you understand why being able to handle data variety is critical.

Having a solution that can mash up disparate data sources without manual intervention significantly reduces the workload on business users, freeing them to focus on applying their knowledge and experience to the data to glean insights.

If IT really wants to help the business make better decisions, the conversation needs to start with what questions they need answered, what data sets might be useful to analyze, and how they can give the business access to them. Variety is where the business is going to realize tremendous value from analytics, and we could perhaps have a more productive conversation about that if we took the word “big” out of it.
