Security: Privacy with Data Anonymity

By Dick Weisinger

Data anonymity is the collection of data and the subsequent removal from the data set of any personally identifiable information. Data anonymity would allow data sets to be published publicly without exposing any personal information about the people from whom the data was collected.

While the goal of data anonymity is good, it has been difficult to achieve. There have been numerous cases in which data released as anonymized was later re-identified. Often data can be re-identified by combining it with other sources of publicly known information.

TechCrunch listed a few of these incidents:

  • In 1996, identities in Massachusetts health records were exposed by matching them against voter registration data.
  • In 2006, Netflix movie viewing information was unmasked when combined with IMDb data.
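
Both incidents follow the same pattern, often called a linkage attack: a data set with names removed is joined to a public data set on shared attributes. The Python sketch below illustrates the idea with entirely made-up records. The field names (zip, birth_date, sex) and the helper link_records are assumptions chosen for illustration and do not reproduce the actual studies.

```python
# Hypothetical toy data: the health table has no names, the voter roll has no diagnoses.
health_records = [
    {"zip": "02138", "birth_date": "1950-01-15", "sex": "M", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1962-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02138", "birth_date": "1980-11-02", "sex": "F", "diagnosis": "diabetes"},
]
voter_roll = [
    {"name": "Alice Example", "zip": "02139", "birth_date": "1962-03-14", "sex": "F"},
    {"name": "Bob Sample", "zip": "02138", "birth_date": "1950-01-15", "sex": "M"},
    {"name": "Carol Placeholder", "zip": "02140", "birth_date": "1971-01-20", "sex": "F"},
]

# Attributes that are not names but can still single a person out when combined.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link_records(anonymized, public, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs where the keys match exactly one public record."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person)

    matches = []
    for record in anonymized:
        candidates = index.get(tuple(record[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches.append((candidates[0]["name"], record["diagnosis"]))
    return matches


reidentified = link_records(health_records, voter_roll)
print(reidentified)
```

In this toy example two of the three "anonymous" health records are tied back to a name, even though neither table on its own links a name to a diagnosis.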

One of the most secure ways of achieving data anonymity is a technique known as differential privacy. Developed by Cynthia Dwork at Microsoft Research, it is now used to protect data managed by Amazon, Facebook, Apple, and other large tech companies.

Differential privacy adds a controlled amount of randomness to the data, which prevents individual records from being revealed even when the data is combined with other sources or background knowledge. When the data is queried, the analysis accounts for the inherent random perturbation and treats it as noise, which largely cancels out in aggregate results.
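
One common way to add this kind of randomness is the Laplace mechanism, which perturbs the answer to a query rather than the stored records. The sketch below is a minimal illustration, not any particular company's implementation. The function names, the sample ages, and the epsilon value of 0.5 are all assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution centered at zero."""
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer 'how many records satisfy predicate?' with Laplace noise added."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1 (sensitivity 1),
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1 / epsilon, rng)


def over_40(age):
    return age > 40


rng = random.Random(42)  # fixed seed so the sketch is repeatable
ages = [23, 35, 41, 52, 67, 29, 48, 71, 33, 58]  # hypothetical data; 6 are over 40

single_answer = private_count(ages, over_40, 0.5, rng)

# Averaging many answers is done here ONLY to show that the noise is centered
# on zero. A real system would limit repeated queries, because each answer
# spends part of the privacy budget.
trials = 20000
averaged = sum(private_count(ages, over_40, 0.5, rng) for _ in range(trials)) / trials

print(f"one noisy answer: {single_answer:.2f}")
print(f"average of {trials} noisy answers: {averaged:.2f}")
```

A single answer can be off by a few units, which masks whether any one person is in the data, while the noise has no systematic bias, so statistics over large groups stay close to the truth.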

Aaron Roth, professor at the University of Pennsylvania, said that “you can dial up to perfect privacy, but then you can do almost nothing useful with the data, or you can go in the other direction and have no real protections. It’s a tradeoff, because privacy protections always come with a cost.”
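
The dial Roth describes is usually expressed as a privacy parameter called epsilon: smaller values mean stronger privacy and more noise. The sketch below estimates the typical error of a noisy count at several settings, assuming Laplace noise with scale 1/epsilon. The chosen epsilon values and the function name are illustrative assumptions.

```python
import random


def mean_absolute_error(epsilon, trials=20000, seed=0):
    """Estimate the average size of the Laplace noise added to a count."""
    rng = random.Random(seed)
    scale = 1 / epsilon  # noise scale for a query with sensitivity 1
    total = 0.0
    for _ in range(trials):
        # Difference of two exponential draws gives Laplace(0, scale) noise.
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        total += abs(noise)
    return total / trials


errors = {eps: mean_absolute_error(eps) for eps in (0.01, 0.1, 1.0, 10.0)}
for eps, err in errors.items():
    print(f"epsilon={eps:<5} typical error in a count: about {err:.2f}")
```

At very small epsilon the error can swamp the true count, which is the "almost nothing useful" end of the dial; at large epsilon the answers are nearly exact and offer little protection.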
