Big Data: Are Expectations Too High?
2013 is expected to be a big year for Big Data. About a quarter of $1+ billion companies are actively working on Big Data projects, and a third are in the evaluation and planning phases, according to a study by Actuate/King Research. Among the remaining 40 percent or so, a lack of staff expertise and concerns over the cost of the technology are holding companies back from Big Data initiatives.
An IDC report (courtesy of EMC) found that businesses will spend more on IT infrastructure to keep pace with growing volumes of data. “The investment in spending on IT hardware, software, services, telecommunications and staff that could be considered the ‘infrastructure’ of the digital universe and telecommunications will grow by 40% between 2012 and 2020… Of course, investment in targeted areas like storage management, security, Big Data, and cloud computing will grow considerably faster.”
There’s a lot of data out there waiting to be analyzed. IDC says that less than 0.5 percent of the thousands of exabytes of data in existence is analyzed. But it also estimates that analytics may be useful on only about a third of all stored data. The gap between 0.5 percent and 33 percent means that a lot of potentially useful data remains untapped, which implies that the potential for Big Data analytics is high.
But some question whether Big Data is ready for prime time. A widely held view is that there is a tremendous shortage of the analytic skills required to do good work with Big Data. As a result, there is a worry that the huge interest in Big Data will draw a large influx of unqualified workers into data analytics, lowering the quality of the work.
There’s also the question of whether expectations for Big Data run too high. Nicholas Carr points out an interesting section from Nassim Nicholas Taleb's recent book Antifragile that runs counter to the benefits many perceive in analyzing huge amounts of data. Taleb argues that analyzing large amounts of data over very short intervals, such as hourly or daily, can often lead to very misleading conclusions. Taleb writes that “The more frequently you look at data, the more noise you are disproportionately likely to get (rather than the valuable part called the signal); hence the higher the noise to signal ratio… The best solution is to only look at very large changes in data or conditions, never small ones.”
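A small calculation can illustrate why frequent sampling raises the noise-to-signal ratio. The sketch below is not taken from Taleb's book or from Carr's post. It assumes a simple model in which a quantity drifts upward at a steady rate (the signal) while independent random fluctuations are added on top (the noise). Under that assumption, the signal in one observation shrinks in proportion to the interval length, while the noise shrinks only with its square root. The function name and the figures of 15 percent drift and 10 percent volatility per year are hypothetical values chosen for illustration.

```python
import math


def noise_to_signal(drift_per_year, volatility_per_year, observations_per_year):
    """Noise-to-signal ratio of a single observation interval.

    Assumed model (an illustration, not a claim about any real data set):
    the signal accumulates linearly with time, while independent noise
    accumulates with the square root of time.
    """
    dt = 1.0 / observations_per_year                  # interval length in years
    signal = drift_per_year * dt                      # expected change over the interval
    noise = volatility_per_year * math.sqrt(dt)       # typical random change over the interval
    return noise / signal


# Hypothetical figures: 15% yearly drift, 10% yearly volatility.
for label, per_year in [("yearly", 1), ("monthly", 12), ("daily", 365), ("hourly", 365 * 24)]:
    ratio = noise_to_signal(0.15, 0.10, per_year)
    print(f"{label:>8}: noise is about {ratio:.1f} times the signal")
```

With these assumed numbers, the ratio grows with the square root of the number of observations per year, so looking at the same data more often yields proportionally more noise per look, which is the effect Taleb describes.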
There is also a worry that human intuition is being bypassed in favor of black-box Big Data solutions. Steve Lohr writes in a New York Times article that “Listening to the data is important, but so is experience and intuition. After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?”