
Unstructured Data: A Goldmine of Information Waiting to be Mined

By Dick Weisinger

Unstructured data refers to documents and files made up of freeform information. That information is difficult to retrieve because it is typically dense in text and has no predefined data model, such as a database schema or an XML or JSON structure. As a result, unstructured data is often overlooked in data analytics. That’s unfortunate, because these documents often contain useful numbers, dates, facts and analysis.

A recent report looked at which data sources business analysts typically use when applying data analytics. Analysts’ go-to sources of information are typically internal data (70 percent), business systems data (59 percent) and structured data (58 percent). Although many tools, such as Hadoop, were designed with a focus on unstructured data, extracting useful information from it still typically takes significant time, expertise and, often, special techniques. As a result, analysts use unstructured data sources on only 37 percent of projects, according to the report by Clutch.

The Clutch report found that even with newer data sources such as the Internet of Things (IoT), social networks and external data now available for analysis, most analytics is still done using internal, business systems and structured data.

Leif Hanlen, a business development executive at Data61, said that “unstructured data sitting inside the enterprise—in the customer relationship-management system, in fields called ‘other’—is like a hole in the ground that’s yet to become a goldmine. The task for analytics of unstructured data is not to build a brand new goldmine, but to extract elements of information from that unstructured data.”
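To make the idea of extracting elements of information concrete, here is a minimal sketch in Python. It assumes a made-up free-text note of the kind that might sit in a CRM “other” field, and it assumes that dates appear as YYYY-MM-DD and figures as “N percent”. The note, the patterns and the `extract_facts` helper are all illustrative assumptions, not something taken from the Clutch report or from Data61. Real documents are far messier and usually call for more capable text-processing tools.

```python
import re

# Hypothetical free-text note, such as a CRM "other" field (invented for illustration).
note = (
    "Customer called on 2016-03-14 about a late invoice. "
    "Agreed to a 15 percent discount if payment arrives by 2016-04-01."
)

# Assumed formats: ISO-style dates (YYYY-MM-DD) and figures written as "<number> percent".
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
PERCENT_RE = re.compile(r"\b(\d+(?:\.\d+)?) percent\b")


def extract_facts(text):
    """Return the dates and percentages found in a block of free text."""
    return {
        "dates": DATE_RE.findall(text),
        "percentages": [float(p) for p in PERCENT_RE.findall(text)],
    }


facts = extract_facts(note)
print(facts)
```

Once the dates and figures are pulled out into a dictionary like this, they can be loaded into the same structured tools that analysts already use for internal and business systems data.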

One comment on “Unstructured Data: A Goldmine of Information Waiting to be Mined”
  1. CareSkore says:

    Relevant insight here! Thanks for sharing. One of the challenges that we find in healthcare IT is structuring unstructured healthcare data. The insights and trends found in that data are, as you mention, invaluable.
