Thursday, August 7, 2014

Interviews for a Data Scientist role

After almost a year of self-study in data science through Coursera, the DSE program and other MOOC sites, I had the chance to interview for Data Scientist roles. Two interviews so far.

The first one, a couple of weeks ago, was for a Researcher / Data Scientist position. It was mainly about prediction models, math, statistics and data mining/machine learning, mostly in R connected with Hadoop and some other mathematical tools. It was my first interview for this kind of position.

The second one, a couple of days ago, was for a pure Data Scientist position. It covered all kinds of data science work on data from many different sources, with a project-driven or data-driven approach depending on the case. This is my most recent experience, and I hope not the last one.


In both interviews I was asked about my career change from development to data and business, which I expected. I want to become a business/data analyst, and I see the data scientist role as part of that, or as a middle step between developer and analyst. That's it.

I got a test in both cases. The first was a project in R: process a data set and answer a couple of questions (exploratory analysis, correlation, etc.), with a document as the output. It takes time, but I had my own space to think about the problem and one week to solve it. I invested 8 hours in it, with a decent result.

The second was an on-the-spot problem during the interview itself. I got a clustering problem and had to think it through, name the problem, propose a method to solve it and draw the process steps. I more or less did it, with many hints from the interviewer. Solving problems under pressure isn't easy, especially when it counts toward your final evaluation in the interview.

Anyway, my lessons learned from both are:

  • Know or learn the business you want to do data science for, or at least don't forget to ask which use cases they are working on.
  • Learn, or at least be familiar with, the Hadoop tools/components (HBase, Hive, Pig, R & Hadoop, etc.).
  • Learn the MapReduce fundamentals and implement basic examples in at least one programming language (Python, Java).
  • Learn R and/or another math and statistics tool, for example MATLAB or Octave. Get familiar with the libraries and with the process of solving different use cases.
  • Get familiar with exploratory analysis and statistical inference.
  • Get familiar with data mining and machine learning methods (both regression and clustering): not only how to use the libraries, but also the basic principles behind them.
  • Do as many projects as possible, on kaggle.com or your own (start simple, but do not avoid the complex ones). Define your own problems and work through them from beginning to end, with a focus on the business value they bring or on some reasonable output you can publish.
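The "basic MapReduce example" mentioned in the list could look something like the word count below. This is only a minimal sketch in plain Python, assuming everything fits in memory and runs in one process. It does not use Hadoop or any MapReduce framework, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary (an assumption).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["data science is fun", "science needs data"]
    print(word_count(sample))
```

The point of writing it this way is to keep the three phases separate, so it is easy to explain in an interview which part would run in parallel on a cluster (map and reduce) and which part the framework handles for you (shuffle).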
And finally, I don't know whether I was successful in at least one of those interviews. They will let me know sooner or later, or never :-). Given my experience I think I did more or less well, but they are continuing to look for someone better. So I need to get better and keep improving in this field. Next time it will be better!
