AI and ML: Project Success Often Defined by Data Quality

By Dick Weisinger

Garbage in, garbage out. Machine learning (ML) and artificial intelligence (AI) algorithms often work by looking for patterns across huge volumes of data, but dirty or poor-quality data sets can throw a wrench into AI projects.

Nathaniel Gates, CEO of Alegion, said that “the single largest obstacle to implementing machine learning models into production is the volume and quality of the training data. This research reinforces our own experience, that data science teams new to building ROI-driven systems try to tackle training data preparation in house, and get overwhelmed.”

That sentiment is echoed in a Dimensional Research report that surveyed existing AI and ML projects. It found that 80 percent of them have stalled, and that 96 percent of respondents said their problems and challenges are typically related to the ability to obtain and label quality data.

Another report, by Cognilytica, found that 80 percent of AI project time is spent preparing the data. The report notes that, surprisingly, data preparation requires a lot of human intervention.

As AI becomes increasingly important, so too will the pressure to create tools that clean data effectively before it is processed. MarketsandMarkets estimates that the data prep market will grow from $1.46 billion in 2016 to $3.93 billion by 2021. That’s a 25.2 percent annual growth rate.
