Data Lineage: Gaining Insight into the Data Used for Analysis and Decisions

By Dick Weisinger

As data is increasingly used in analytics, machine learning, and business intelligence, it’s important to have insight into the history of the data being used. Where did it come from, or what was it derived from? How old is the data? What is its quality? Who has access to it? Where and by whom has it been used? How trustworthy is it? Industries like finance and pharmaceuticals have extensive regulations that require auditing of how data is used.

Data lineage solutions are designed to answer those kinds of questions about data history, quality, trustworthiness, condition, and usage.
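
As a rough illustration of the kind of record such a solution might keep, the sketch below models lineage as a small graph in Python. The `Dataset` fields, the dataset names and dates, and the `trace_origins` helper are all hypothetical, chosen only to mirror the questions above; real lineage products use far richer models.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Dataset:
    """Hypothetical lineage record for one dataset (illustrative fields only)."""
    name: str
    created: date                                      # how old is the data?
    derived_from: list = field(default_factory=list)   # names of direct upstream datasets
    accessed_by: set = field(default_factory=set)      # who has used it?


def trace_origins(catalog, name):
    """Return the sorted names of the root sources that `name` ultimately derives from."""
    seen = set()
    roots = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = catalog[current].derived_from
        if not parents:
            # No upstream datasets recorded, so this is an original source.
            roots.add(current)
        stack.extend(parents)
    return sorted(roots)


# Made-up catalog: a report derived from a merged view of two raw sources.
catalog = {
    "crm_export": Dataset("crm_export", date(2019, 1, 5)),
    "web_logs": Dataset("web_logs", date(2019, 2, 1)),
    "customer_360": Dataset("customer_360", date(2019, 2, 3), ["crm_export", "web_logs"]),
    "churn_report": Dataset("churn_report", date(2019, 2, 10), ["customer_360"], {"analyst_a"}),
}

print(trace_origins(catalog, "churn_report"))
```

Walking the `derived_from` links backward answers the "where did it come from?" question; the other fields would answer the questions about age and usage in the same way.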

Ian Rowlands, technology author, said that “an enterprise CEO really ought to be able to ask a question that involves connecting data across the organization, be able to run a company effectively, and especially, to be able to respond to unexpected events. Most organizations are missing this ability to connect the data together. It used to be really hard to get CEOs to care about data. A trend that we’ve seen increasingly over the past several years is that CEOs have been getting more and more irritated at the inability to get value out of their data.”

Some examples of how data lineage is being applied:

  • A bank adopted data lineage solutions and found that it became 80 percent more efficient in addressing regulatory and data forensic issues.
  • Data lineage is being used in supply chain management to give consumers information about the origin and history of food products in the grocery store.
  • Nonprofits trace donations received and how funds were allocated and spent.

Vikram Bellapravalu, senior solution leader at McKinsey, said that “there are some big benefits of data lineage, including full transparency for stakeholders, an end-to-end view of data governance, and the ability to get rid of redundant information sources. The business can also see where there might be issues of entitlement, so that they don’t have data leaks, such as in a hack. Even from an IT perspective it provides root cause analysis—to see what would happen if a step were added to or removed from the IT architecture.”
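
The "what would happen if a step were removed" idea can be sketched as a walk over lineage links in the downstream direction. This is only a minimal illustration, not a description of any particular product: the step names, the `edges` list, and the `downstream_impact` function are assumptions made for the example.

```python
from collections import deque


def downstream_impact(edges, removed):
    """Return, sorted, every step that depends directly or indirectly on `removed`."""
    # Build an upstream -> [downstream, ...] adjacency map from the edge list.
    children = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    affected = set()
    queue = deque([removed])
    while queue:
        step = queue.popleft()
        for child in children.get(step, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return sorted(affected)


# Hypothetical pipeline: each pair reads "upstream feeds downstream".
edges = [
    ("ingest", "clean"),
    ("clean", "aggregate"),
    ("clean", "audit_log"),
    ("aggregate", "dashboard"),
]

print(downstream_impact(edges, "clean"))
```

Running the same traversal over reversed edges gives the upstream view, which is the direction a root cause investigation would follow.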
