Untangle your data spaghetti
Since we started our business at CROPLAND, back in 2013, we have seen many famous quotes on big data, data analytics or data in general. Here are some of our favorites:
“Data is the new oil …”
“In God we trust, all others must bring data”
“If content is King, then data is King Kong”
“Data is the new science, big data holds the answer”
“Without data you’re just another person with an opinion”
Today, in 2020, data analytics and, more broadly, data-driven decision-making are becoming increasingly mainstream in business settings; in some organizations they have even been fully integrated into daily operations.
In such situations, it’s not uncommon for people to turn around and say “Well … mission accomplished” or “Been there, done that … and got the t-shirt. Actually, we should be happy we have reached this station, no?”
Well, there is a potential downside to this explosion of data too … and we, at CROPLAND, call this the data spaghetti. “I’m sorry, data spaghetti?”
Indeed, almost all operational processes in companies nowadays are digitized. This situation provides many opportunities for deep dives and thorough analyses. These days, almost any system allows a user to export data or reports in CSV or other text formats. Many analysts and functional managers have started consolidating these information sources and performing what we call disconnected analyses.
THE most frequently used tool for such analyses is Microsoft Excel … and that is where the spaghetti starts.
“There are more Excel sheets on a manager’s laptop, than there are spaghetti strings in a good spaghetti bolognaise.”
No tomatoes, no real bolognaise. No Excel, no data spaghetti
Microsoft has made it terribly easy for data to be integrated in Excel; additional plug-ins or tools like Power Query even allow users to program ETL (Extract, Transform, Load) operations without having to contact database administrators or high-end developers. With the introduction of Power BI, we have seen this evolution expand or even explode.
The tools are not the issue, however; it’s how one uses them. Many, if not all, of the data operations in these tools tend to be chained one after another; they also contain hard-coded links, assumptions and fixed references to other, often temporary, data sets.
Some examples:
* Ever experienced VLOOKUP functions that are not copied down to the bottom of an updated table?
* Ever had to update a file with a reference to a source file that appears to be gone?
* Ever forgotten to update the table reference in one of your pivot tables?
…
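To make the lookup problem above concrete, here is a minimal sketch in Python (standard library only) of a keyed lookup that reports unmatched rows instead of silently leaving gaps, which is what a VLOOKUP that was not copied down tends to do. The data, column names and helper functions (read_rows, build_lookup, enrich) are hypothetical and only serve as an illustration; in practice the rows would come from the CSV exports of your own systems.

```python
import csv
import io

# Hypothetical exports; in practice these would be CSV files from source systems.
ORDERS_CSV = """order_id,customer_id,amount
1001,C01,250.0
1002,C02,120.5
1003,C04,75.0
"""

CUSTOMERS_CSV = """customer_id,region
C01,North
C02,South
C03,East
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_lookup(rows, key):
    """Index rows by key and refuse duplicate keys instead of picking one silently."""
    lookup = {}
    for row in rows:
        if row[key] in lookup:
            raise ValueError(f"duplicate key {row[key]!r} in lookup table")
        lookup[row[key]] = row
    return lookup


def enrich(orders, customers_by_id):
    """Add the customer's region to every order; collect orders without a match."""
    matched, unmatched = [], []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is None:
            unmatched.append(order)
        else:
            matched.append({**order, "region": customer["region"]})
    return matched, unmatched


orders = read_rows(ORDERS_CSV)
customers_by_id = build_lookup(read_rows(CUSTOMERS_CSV), "customer_id")
matched, unmatched = enrich(orders, customers_by_id)
print(f"{len(matched)} matched, {len(unmatched)} unmatched")
```

Because the lookup runs over every row of whatever file is loaded, a longer export next month is covered automatically, and the unmatched list makes missing reference data visible rather than hidden in a cell somewhere.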
What happens is that the data itself is uploaded or generated in one of the nicely established ICT systems (databases), but the analytics (and therefore also the knowledge) are spread across several different (versions of) Excel sheets throughout the organization. All fine, until somebody asks you to send them “the latest version of the business plan”.
If this sounds familiar, we feel your pain.
Where are you on your data journey?
Oftentimes, we notice that, as the amount of data in the organization increases, so too does the quality of the decision-making. That is, until a tipping point is reached, and the decision-making quality actually decreases … due to the data spaghetti.
In such situations, managers and analysts often grapple with the following issues:
1. How to ensure one single version of the truth?
Two different people analyzing the same data set might end up with different conclusions.
2. How to safeguard the quality of the insights and analyses?
Testing is a skill that is often forgotten, and it’s very difficult to detect your own errors, so how do we solve this?
3. How to make our analyses sustainable?
Many analyses start from an idea, or a question raised by a manager, but the key is to make them sustainable over time. How do we ensure continuity?
4. How sophisticated are my data analyses?
Although today’s (Microsoft) tools are very powerful, they have their limitations too. Some advanced (predictive) analytics are just not possible; what if you want to discover the real power in your data (with data science or A.I.)?
The solution
The solution: an effective data strategy. By implementing an effective data strategy in your organization, you can really start to reap the rewards from the explosive increase in data we have been experiencing over the past decade.
So, if you are a manager who wants one single version of the truth, based on efficient reporting that can handle complex logic, if you are a manager who wants to harness the full potential of your data and discover the power of Artificial Intelligence … then give us a call.