Data scientists are considered to have the hottest job right now, but a new study suggests they’re little more than “digital janitors” who spend most of their time cleaning data to prepare it for analysis.
That’s according to CrowdFlower, a crowdsourcing company, which surveyed 80 data scientists with varying levels of experience.
While an advanced degree is usually required for the position, a full 60 percent of respondents said they spend most of their time cleaning and organizing data, leaving little time for analytical tasks like building training sets and refining algorithms.
“You have your hardest-to-hire resource spending most of their time cleaning data,” said Lukas Biewald, CrowdFlower’s cofounder and CEO. “It’s a humongous waste for organizations.”
Cleaning and organizing data, as it turns out, is also data scientists’ least favorite part of the job, according to more than half of CrowdFlower’s respondents.
That makes for an unhappy combination, but data scientists remain undaunted: More than 80 percent said they’re happy at work.
CrowdFlower’s findings also confirm that there’s a shortage of data scientists in the business world. In last year’s survey, 79 percent of respondents said there was a shortage; this year, that figure was up to 83 percent.
Want to land a data scientist job for yourself? The most in-demand skills, according to CrowdFlower, are SQL, Hadoop, Python, Java, R, Hive, MapReduce, NoSQL, Pig and SAS.
Coming up next is machine learning, which was singled out as especially important by more than half of the respondents CrowdFlower surveyed.
“Over the last couple of years every CEO has been asking, ‘What’s our big data strategy?’” Biewald said. “They need to start asking about machine learning.”