Friday 18 June 2021

Challenge 1: Data is Hard To Find

Ever had one of those days...

The other day I was looking for a report that had impressed me about a year ago. I knew that I had downloaded it and kept it, but I wasn't sure where to find it. I ended up searching my computer hard drive, my external hard drive and my cloud storage. I found it after about 10 minutes of work, but the experience reminded me of some statistics I have read about how much time employees spend looking for data.

More recently, the increasing use of datasets on cloud storage has created a whole new level of data access challenges. I recently discovered that my data analytics tool could not connect to a Databricks cloud data store because the enterprise VPN would not let the connection go through. I had to disconnect from the VPN and then connect to the data store over the public Internet.

This comes from a 2019 Forbes article: 

Numerous studies of "knowledge worker" productivity have shown that we spend too much time gathering information instead of analyzing it. In 2001, IDC published its venerable white paper, "The High Cost of Not Finding Information," noting that knowledge workers were spending two and a half hours a day searching for information.

Since then, we have seen the rise of the cloud, ubiquitous computing, connectivity and everything else that was science fiction when we were kids becoming a reality — including the imminent emergence of AI. Yet in 2012, a decade after the IDC report, a study conducted by McKinsey found that knowledge workers still spend 19% of their time searching for and gathering information, and a 2018 IDC study found that "data professionals are losing 50% of their time every week" — 30% searching for, governing and preparing data plus 20% duplicating work.

If approximately 20% of our working time is spent searching for and gathering information, that translates into one day out of five. If you sum up the total compensation cost for your organization and take 20% of that total, that is the financial investment you (and your organization) are making to find and discover information. We need to do better than that.
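To make that arithmetic concrete, here is a minimal sketch in Python. The headcount and average compensation are made-up figures for a hypothetical organization, and the function name `search_cost` is only illustrative; the 20% share is the approximate figure discussed above.

```python
def search_cost(total_compensation: float, search_share_percent: float = 20) -> float:
    """Estimate the compensation spent on searching for information.

    search_share_percent is the share of working time spent searching;
    the default of 20 is the approximate figure discussed in this post.
    """
    if not 0 <= search_share_percent <= 100:
        raise ValueError("search_share_percent must be between 0 and 100")
    return total_compensation * search_share_percent / 100


# Hypothetical organization (assumed numbers, not real data):
headcount = 50
avg_compensation = 90_000  # average total compensation per person, per year

total = headcount * avg_compensation
cost = search_cost(total)

print(f"Total compensation cost: {total:,.0f}")
print(f"Estimated cost of finding information: {cost:,.0f}")
```

Swap in your own headcount and compensation figures to get a rough estimate for your team. If you survey your team members on how much time they actually spend locating data, you can pass that share in place of the default.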

The causes include data silos: data held by one office or individual with limited access for others. Another cause is the lack of integrated data inventories: we don't even know what we have, so we cannot find it. Multiple data stores mean we may have different information in different places. Document management systems can also be difficult to use. It is great to have an enterprise document management system, but if the user interface and the document storage structure are too complicated, it takes forever to find a relevant document. Cloud data adds a whole other level of data silos.

So what's the solution? Here are some things we can do:
  1. Recognize that we have a data silo issue.
  2. Evaluate the cost of finding data - check with your team members on their experiences in locating data. 
  3. Identify the key data bottlenecks - where is it hardest to find and retrieve data?
  4. Determine which data governance measures, such as metadata, data inventories and search tools, are necessary to resolve the bottleneck.
  5. Do it, fix it, try it:  change something, have a test or trial to see what works and then deploy. Often it is a small change that can make a big difference. 
Ultimately, data is one of our most strategic assets. Helping our teams and analysts get to the data and documents they need more quickly is going to help everyone do their job better. We can do this! Often the fixes are not technical; they are about making better use of what we already have.

What has been your experience in data bottlenecks and data silos?  What solutions have you seen work?  Feel free to share best practices in the comment section.

Feel free to contact me or connect with me on LinkedIn if you want more information on solutions and options. 

