By: Sonny Zulhuda
While big data is by now a commonly heard term, dark data is not. Some participants in the recently held Singapore Symposium whispered to me that they had never heard of the term, so you could say they were in the dark about Dark Data.
The term is new to me as well! Except that I had the chance, a little earlier than those guys, to read about it and finally make sense of it.
It is all rooted in the fact that we have an abundance of data around us, and in how much of those abundant data can be sourced as information. Yes, it is about Big Data. As we know, Big Data is about quantifying everything that can possibly become data. A person's identity no longer depends on what is printed about him on documents (ID, passport, certificates). A person is now identifiable from his mumbled words, his movements, his location, his mood and even the pattern of what he does every day. All those data are being quantified and measured because they are available from a myriad of media, devices, and interactions (both human and artificial). What makes it possible? You name it: mobile gadgets, social media, CCTVs and the commercial transactions you have been making, to name a few.
In organisational life, the same is happening. More and more data are collected and stored by organisations, manually and electronically. Data about employees (and their mumbled words, movements, location, mood, etc.), visitors, business transactions, internal meetings, vendors' work, and all kinds of reports, records and repositories are increasingly collected and stored... but not necessarily used. On many occasions those data are no longer usable after their first collection, and yet they still fill up the organisation's storage (recent research indicates that these unusable data may make up as much as 70% of organisations' data).
Those are dark data. Untapped, untagged and sometimes unknown data.
Now here is the thing: the fact that they remain unused does not mean they are valueless. You can run this simple test: if you were to hand all these data over to your competitor or any other third party, would you suffer a loss? What about a competitive loss, a breach of secrets, an infringement of privacy, reputational damage or legal liability? If yes, then such Dark Data should be urgently managed.
That is the first message that I delivered in my one-hour talk in Singapore yesterday.