Data Algebra: A More Rigorous and Mathematical Approach to Data Analysis

By Dick Weisinger

Data algebra is a technique that some say can abstract away much of the grunge and headaches caused by many current tools and make managing data, especially Big Data, much easier. It is a mathematical approach to analyzing data sets by applying classical set theory.
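To make the set-theory idea concrete, here is a minimal sketch in plain Python. It is not the Algebraix library's API; the `record` and `natural_join` helpers and the sample data are hypothetical names invented for illustration. Each record is modeled as a set of (attribute, value) pairs and each data set as a set of records, so ordinary set operations apply directly.

```python
# Hypothetical illustration only: plain Python sets, not the Algebraix library.

def record(**fields):
    """Model one record as a frozenset of (attribute, value) pairs."""
    return frozenset(fields.items())

def natural_join(left, right):
    """Merge every pair of records that agree on their shared attributes."""
    joined = set()
    for l in left:
        for r in right:
            merged = l | r
            # If an attribute ended up with two different values, the
            # records disagree and the pair is dropped.
            if len(dict(merged)) == len(merged):
                joined.add(merged)
    return joined

customers = {record(cust_id=1, name="Ada"), record(cust_id=2, name="Grace")}
orders = {
    record(cust_id=1, item="book"),
    record(cust_id=1, item="pen"),
    record(cust_id=3, item="ink"),
}

# Union is plain set union, so a duplicate record simply disappears.
same_customers = customers | {record(cust_id=1, name="Ada")}

# The join keeps only customer/order pairs with a matching cust_id.
result = natural_join(customers, orders)
for row in sorted(sorted(r) for r in result):
    print(dict(row))
```

Because everything here is a set, there are no duplicate rows and no row order to worry about, which is the kind of property that makes algebraic reasoning about data possible.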

Bill Rogers, a senior engineer at IBM, said that “you can do far more sophisticated optimization when you’re using algebraic techniques than you can when you’re just using high-level procedural techniques.”

Charles Silver, CEO of Algebraix Data, a company promoting the use of data algebra, said that “surprisingly, mathematics has played almost no role in software and programming until now – even though it has enabled huge advances in virtually every other form of science and engineering. Data algebra provides a universal language for data and can be the technology that allows the thousands of different data models in the world to be integrated.”

Algebraix Data has open-sourced a Python library that implements the concepts explained in a freely downloadable book. The abstract notation of the approach is certainly elegant, and it frames the process of data analysis in a mathematically rigorous way.

But isn’t SQL already a widely used programming language based on ideas from set theory? What’s different?

The Algebraix Data book calls SQL “a mathematical disaster of the first order” and points out troublesome aspects of the language, such as the null value, a concept described as a “wholly unmathematical idea”. Also, because SQL can only handle data in tabular format, other types of databases have had to step in where an RDBMS falls short, such as for hierarchical data, text and graphs. Programmers have also long had trouble mapping data created in the object models of programming languages into the tables of a database, a frequently discussed problem called “impedance mismatch”.
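The complaint about null is easy to see in practice. The sketch below is my own illustration, not an example from the book. It uses Python's built-in `sqlite3` module with a made-up table `t` to show that a comparison involving NULL is neither true nor false, so a condition and its negation no longer split a table into two complete halves.

```python
import sqlite3

# In-memory database with a hypothetical single-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,)])

# NULL = NULL evaluates to NULL (unknown), which Python sees as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Count the rows where x = 1, the rows where x <> 1, and all rows.
eq = conn.execute("SELECT COUNT(*) FROM t WHERE x = 1").fetchone()[0]
ne = conn.execute("SELECT COUNT(*) FROM t WHERE x <> 1").fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print("NULL = NULL ->", null_equals_null)
print("x = 1:", eq, "| x <> 1:", ne, "| all rows:", total)
conn.close()
```

In classical set theory a subset and its complement always add up to the whole set. Here the row holding NULL falls into neither group, which is the sort of behavior the book objects to.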

The concept certainly has merit, but it’s likely to gain traction only after a serious uphill battle against existing databases and analysis tools.

December 29th, 2015

Category: Big Data
