Data Science: The Drudgery of Data Prep

By Dick Weisinger

Despite the glamour of being a Data Scientist, the reality is that much of the job is drudgery. A report by Anaconda found that a little less than half of a Data Scientist's time is spent prepping data: cleaning it and then loading it. Even so, this may be an improvement over previous estimates, which ranged as high as 80 percent of time being spent on data prep.

Source: Anaconda, The State of Data Science 2020: Moving from Hype Toward Maturity

The report found that it can be challenging to reach the point where a data scientist has results that can be successfully pushed into production and used. It takes significant time, not just to prep data, but also to build the model and to create visualizations of the results. As a result, about half of data scientists say that their work has had little impact on business outcomes.

Peter Wang, Anaconda CEO and Co-Founder, said that “data science has the ability to be transformational for businesses, but our 2020 survey shows that both organizations and professionals in the space are still in the process of maturing. From broadening the data science educational curriculum to being more intentional with open-source security, there are clear learnings here for the industry at large to implement in order to improve. We’ve seen positive progress in many of these areas, but there is still work to be done.”
