Interview Experience for the role of Data Scientist at CodeBase Solutions

1.     What are the ML techniques you’ve used in projects?
2.     The very first question was about PCA: why use PCA?
3.     What are the types of clustering techniques (not algorithms)? Which clustering technique would you use in which scenario? Give an example with a program.
4.     OCR: What type of OCR did you use in your project, graphical or non-graphical?
5.     OCR: What is noise? What types of noise will you face when performing OCR? (The interviewer remarked: "Handwritten OCR could give more than 70% accuracy when I wrote it in 2012, but you’re saying 40%.")
6.     Explain Logistic Regression vs Linear Regression with a real-life example.
7.     Is a Decision Tree binary or multiway? Why use them?
8.     Do you know MapReduce and ETL concepts?
9.     What is a Dictionary or Corpus in NLP and how do you build it? 
10.  How do you build a Dictionary, Semantic Engine, and Processing Engine in an NLP project? Where do all the synonyms (thesaurus words) go?
11.  What are the types of forecasting? Other than statistical models (ARIMA), what ML and DL models for forecasting have you used in your projects? (He gave "Fast-forwarding models" as an example.)
12.  What is a Neural Network? What types of Neural Networks do you know?
13.  Write a Decision Tree model as a Python program.
14.  How do you build an Azure ML model? Which Azure products have you used? (I said Azure ML Studio.)
15.  A CIBIL score is an example of a Fuzzy model and not a Classification model.
16.  What is an outlier? Give a real-life example. How do you find and eliminate them? (I gave the example of calculating the average salary of an IT employee.)
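
For the outlier question, one common way to frame an answer is the interquartile-range (IQR) rule. The sketch below is only an illustration, not what was written in the interview: the salary figures are made up, the helper names iqr_bounds and split_outliers are hypothetical, and the multiplier k=1.5 is the conventional default rather than anything the interviewer specified. It shows how a single extreme salary pulls the average up and how the IQR fences flag it.

```python
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    kept = [v for v in values if lower <= v <= upper]
    outliers = [v for v in values if v < lower or v > upper]
    return kept, outliers


# Made-up annual salaries (illustrative only); the last value is far from the rest
salaries = [42_000, 45_000, 47_000, 50_000, 52_000,
            55_000, 58_000, 60_000, 900_000]

kept, outliers = split_outliers(salaries)
print("mean with all values:     ", round(mean(salaries)))
print("mean after removing flags:", round(mean(kept)))
print("flagged as outliers:      ", outliers)
```

Whether to drop flagged values, cap them, or simply report the median instead of the mean depends on the use case; removal is shown here only because the question asked about eliminating them.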

Date: 19/06/21
