Data Science Interview Experience for the role of Junior Data Scientist at Bridgei2i
1) What is the difference between cluster sampling and systematic sampling?
2) Differentiate between a multi-label classification problem and a multi-class classification problem.
3) How can you iterate over a list and also retrieve element indices at the same time?
4) What is Regularization and what kind of problems does regularization solve?
5) If the training loss of your model is high and almost equal to the validation loss, what does it mean? What should you do?
6) Explain the evaluation protocols you use for testing your models. Compare hold-out validation, k-fold cross-validation, and iterated k-fold cross-validation.
7) Can you cite some examples where a false positive is more important than a false negative?
8) What is the advantage of performing dimensionality reduction before fitting an SVM?
9) How will you find the correlation between a categorical variable and a continuous variable?
10) How will you calculate the accuracy of a model using a confusion matrix?
11) You are given a dataset with 1500 observations and 15 features. How many observations will you select for each decision tree in a random forest?
12) Given that you let the models run long enough, will all gradient descent algorithms lead to the same model when working on logistic or linear regression problems?
13) What do you understand by statistical power (sensitivity), and how do you calculate it?
14) What are pruning, entropy, and information gain in a decision tree algorithm?
15) What are the types of biases that can occur during sampling?
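Questions 3 and 10 lend themselves to a short illustration. The following is only a minimal sketch of one possible answer to each, not the answer expected in the interview. The function names, the feature list, and the confusion-matrix counts are all hypothetical and chosen purely for illustration.

```python
def indexed_items(items):
    """Question 3: enumerate() yields (index, element) pairs during iteration."""
    return [(index, item) for index, item in enumerate(items)]


def accuracy_from_confusion_matrix(matrix):
    """Question 10: accuracy = correct predictions / all predictions.

    The diagonal of a square confusion matrix holds the correctly
    classified counts, so accuracy is the diagonal sum over the total.
    """
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical feature names, used only to show the loop form.
features = ["age", "income", "tenure"]
for index, name in enumerate(features):
    print(index, name)

# Hypothetical binary confusion matrix: rows are actual classes,
# columns are predicted classes.
confusion = [[50, 10],
             [5, 35]]
print(accuracy_from_confusion_matrix(confusion))  # (50 + 35) / 100
```

The accuracy helper works for any number of classes, since it only relies on the diagonal holding the correct predictions.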