According to a New York Times article by Steve Lohr (2014), data scientists spend 50% to 80% of their time on data cleaning and transformation, a process known as data wrangling, leaving only 20% to 50% of their time for data modeling. This underscores the importance of data wrangling skills.

“Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets.” (Steve Lohr, August 17, 2014)

However, most degree programs focus on data modeling, presumably because it is the most technically challenging part and the most worthy of a degree. Most data science programs do not teach data wrangling and visualization systematically in a dedicated course; instead, they expect students to pick up both alongside modeling, so students face two challenges at once. The same is true of most statistics classes, where students must learn programming software in addition to the statistics topics themselves. This certification is therefore designed for students without much basic knowledge of R, a primary statistical analysis software used by data scientists: it gives them the necessary programming knowledge so that they can focus on statistics and machine learning topics in their future endeavors. The course also aims to give aspiring data scientists the introductory knowledge and skills they need to get started.
