
Data Hoarding: An Expensive Habit Most Businesses Can't Shake

By Dick Weisinger

With all the hype around Big Data and the ease with which data can now be captured and stored, it’s no wonder that data hoarding is becoming a problem for companies. Data hoarding is the excessive storing of data, particularly data that is outdated or no longer relevant. The problem is compounded when data is stored in incorrect or insecure places.

What’s wrong with never deleting any data once it’s been stored?

Studies on the usage of files stored in file shares and repositories have found that 80 percent of the files stored in those locations haven’t been accessed for three to five years. Keeping these unused files can be expensive: there are costs for infrastructure, backups and recovery, and data migrations. But the operational costs can be dwarfed by the legal costs that could arise. During the legal discovery period, every piece of data concerning the issue in question needs to be examined to determine whether or not it is relevant. Legal costs for eDiscovery of data average $10,000 to $20,000 per gigabyte.

But shouldn’t we keep all our data if we plan to do Big Data analytics? Jake Frazier, attorney at IBM, told InformationWeek: “‘Well, we don’t want to delete our data if we can mine value out of it.’ I think that’s a false dichotomy. Once you delete data that’s stale, the algorithms actually function much better from an analytics standpoint. Leaving stale data can actually skew the algorithms towards older facts.”

Amber Simonsen, consultant and project manager, said that “you get to a point where you have to ask yourself what you plan to produce with the data. We’re all so data-hungry that sometimes our eyes are bigger than our stomach. Organizations have a ton of data – there’s no end in sight – but is data from five years ago really going to provide you with anything valuable? … What begins as an innocent desire to keep relevant information close at hand can turn into an unhealthy obsession that plagues IT departments and Records Managers in organizations everywhere.”

