Time flies when you ‘R’ having fun

As you know from an earlier blog post, Just-BI is very keen on staying up to date with, and knowledgeable about, the latest developments. That is why some of us are taking part in the Data Science course @ DIKW. And because we find it important to share knowledge with our customers, we want to keep you posted.

Here are just a few of the things we learned during training days 2 and 3:

  • Programming in R and Python
  • Preparing datasets for Data Mining
  • Creating models using methods like:
    • Logistic Regression
    • Classification
    • K-means
      and understanding their algorithms from both a functional and mathematical perspective
  • Training Models
  • Validating Models
  • Comparing ROC curves and confusion matrices
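
To give a feel for that last point, here is a minimal sketch in plain Python of how a confusion matrix and a ROC curve can be computed and compared for two models. It is not the course material: the labels, the scores of `model_a` and `model_b`, and the function names are all made up for illustration, and in practice you would probably reach for a library rather than write this by hand.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_points(y_true: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return ROC points (false positive rate, true positive rate).

    One point per distinct score used as a threshold, highest threshold first,
    starting from (0.0, 0.0). Assumes both classes occur in y_true.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        tp, fp, _fn, _tn = confusion_matrix(y_true, y_pred)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: true labels and the predicted scores of two made-up models.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3]
model_b = [0.6, 0.7, 0.8, 0.3, 0.4, 0.2, 0.9, 0.5]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    # Confusion matrix at a fixed cut-off of 0.5.
    predictions = [1 if s >= 0.5 else 0 for s in scores]
    print(name, "(tp, fp, fn, tn) =", confusion_matrix(y_true, predictions))
    # The AUC summarises the whole ROC curve in one number.
    print(name, "AUC =", auc(roc_points(y_true, scores)))
```

The confusion matrix describes a model at one particular cut-off, while the ROC curve shows how it behaves across all cut-offs. That is why it is worth looking at both when comparing models.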

But let’s start with the big picture and look at some of the most important crafts of Data Science first.
What is it all about?

  • Discover unknown unknowns in data
  • Obtain predictive, actionable insights
  • Communicate business data stories
  • Build business decision confidence
  • Create valuable Data Products

Is it a bird? Is it a plane? Unless Superman works for you, no single person can possibly get all of that done. A Data Science TEAM is required, and not just any team of random people; it should consist of people with a variety of skills.

Here is an overview of the range of skills that a Data Science team should cover:

[Image: overview of Data Science skills]

Source: H. Harris et al., “Analyzing the Analyzers”

And in this picture you can see how the skill sets should be distributed between the different roles within a Data Science team:

[Image: distribution of skill sets across Data Science roles]

Source: H. Harris et al., “Analyzing the Analyzers”

Once a Data Science team is in place and ready to start, the process of Data Mining (otherwise known as “work”) can begin. But where and how to start?

The Cross Industry Standard Process for Data Mining (CRISP-DM) is considered the leading methodology for tackling possible problems, and it provides structure for any Data Mining project. These graphics illustrate what the workflow looks like and which steps to take first.

[Image: CRISP-DM process diagram]

[Image: generic tasks (bold) and outputs (italic) of the CRISP-DM reference model]

Watch this space for upcoming updates and good reads. In future blogs, we will go into more detail on some of the topics mentioned here.

And for our Dutch readers (or for anybody who does not mind watching a documentary on Dutch television), here is some food for thought and a few interesting insights.

This video, called “What makes you click”, questions the ethical boundaries around the collection and use of personal data by organizations. Who uses this data, and for what? Find out why even some of the data collectors are asking for an ethical code around personal data.

Author: Viveca Cohen

2 Responses to "Time flies when you ‘R’ having fun"

  1. Bardo Schütz Posted on October 29, 2016 at 11:11 am

    Interesting, please keep sharing. Is just-BI already playing the Big Data game?

    • Viv Cohen Posted on November 28, 2016 at 1:20 pm

      Hi Bardo,

      We are definitely applying the knowledge we gained during our data science training.
      At some clients we are also involved in setting up data science teams.
      Whether we should call those efforts big data related depends on the definition of big data.
      A good topic for our next blog I would say, thanks for the trigger!