Interview Experiences

Interview QnA ZS & Legato

Data Scientist Interview QnA Company: ZS & Legato 1. What does it mean when p-values are high or low? High p-values indicate that your evidence is not strong enough to suggest an effect exists in the population. An effect might exist, but it’s possible that the effect size is too small, the sample size is too small, …

Interview QnA HP & Capital One

Data Scientist Interview QnA Company: HP & Capital One 1. If training on all the features in the dataset gives an accuracy of 100%, but the accuracy on the validation set is 75%, what should be looked out for? Training accuracy is much higher than validation accuracy, indicating that it’s the case of …

Interview QnA Capgemini

Data Scientist Interview QnA Company: Capgemini 1. Conditions for Overfitting and Underfitting. If the training accuracy and test accuracy are close, then the model has not overfit. If the training result is very good and the test result is poor, then the model has overfitted. If the training accuracy and test accuracy are low …

Interview QnA CRED

Data Scientist Interview QnA Company: CRED 1. What do you understand by the term Normal Distribution? Normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean. In graph form, normal …

Interview QnA Deloitte

Data Scientist Interview QnA Company: Deloitte 1. Difference between Correlation and Regression. The main difference between correlation and regression lies in what each tells you about the relationship between two variables; let them be x and y. Here, correlation measures the degree of the relationship, whereas regression is a technique to determine how one …

Interview QnA Genpact

Data Scientist Interview QnA Company: Genpact 1. Interquartile range? A common and effective way to find outliers is by using the interquartile range (IQR). The IQR contains the middle bulk of your data, so outliers can be easily found once you know the IQR. Quartiles divide the entire set into four equal parts. So, there …
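As a rough illustration of the IQR idea in the excerpt above, here is a minimal Python sketch. The function name `iqr_outliers`, the sample data, and the 1.5 × IQR fence are assumptions made for this example (the 1.5 multiplier is a common convention, not something the excerpt states), and the quartile method used is only one of several in use.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a conventional choice, assumed here for illustration.
    """
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep the original order of the flagged values.
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: one value sits far from the rest.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(sample))
```

Different quartile conventions can shift the fences slightly, so borderline points may be flagged by one method and not by another.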

Interview QnA L&T Financial Services

Data Scientist Interview QnA Company: L&T Financial Services 1. Assumptions in Multiple Linear Regression. The regression has five key assumptions: Linear relationship. Multivariate normality. No or little multicollinearity. No auto-correlation. Homoscedasticity. 2. Entropy. Entropy is a measure of disorder or uncertainty, and the goal of machine learning models and Data Scientists in general is to reduce …
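To make "entropy as a measure of uncertainty" concrete, here is a small sketch that computes Shannon entropy (in bits) over a list of class labels. The excerpt does not say which entropy definition the post uses, so the Shannon form and the function name `shannon_entropy` are assumptions for illustration.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    # Sum of -p * log2(p) over the observed classes.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# An evenly mixed set is maximally uncertain for two classes.
print(shannon_entropy(["yes", "yes", "no", "no"]))
# A pure set carries no uncertainty.
print(shannon_entropy(["yes", "yes", "yes", "yes"]))
```

Under this definition, a split that makes the label groups purer lowers the entropy.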

Interview QnA Larsen & Toubro

Data Scientist Interview QnA Company: Larsen & Toubro 1. Ways to avoid overfitting. Some steps that we can take to avoid it: 1. Data augmentation 2. L1/L2 Regularization 3. Remove layers / reduce the number of units per layer 4. Cross-validation 2. Image classification algorithms. Image Classification algorithms are the algorithms which are used to classify labels for …

Interview QnA Philips

Data Scientist Interview QnA Company: Philips 1. Time Series (ARIMA)? ARIMA, short for ‘AutoRegressive Integrated Moving Average’, is a forecasting algorithm based on the idea that the information in the past values of the time series can alone be used to predict the future values. 2. How to reduce overfitting? Techniques to reduce overfitting: …

Interview QnA Tredence

Data Scientist Interview QnA Company: Tredence 1. Explain how the filter function works in Python? The filter() function filters an iterable using a function that tests whether each element is true or not, keeping only the elements for which the test passes. The filter() function takes two arguments: function – a function and iterable – an iterable like sets, lists, tuples …
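A short sketch of the built-in filter() described above. The helper `is_even` and the sample lists are made up for this example; filter() itself returns a lazy iterator, so it is wrapped in list() here to materialize the results.

```python
def is_even(n):
    """Test function: True for even integers."""
    return n % 2 == 0

numbers = [1, 2, 3, 4, 5, 6]

# filter(function, iterable) keeps elements for which function returns a truthy value.
evens = list(filter(is_even, numbers))

# Passing None as the function keeps the elements that are themselves truthy.
truthy = list(filter(None, [0, 1, "", "a", None]))

print(evens)
print(truthy)
```

The same selection can be written as a list comprehension, `[n for n in numbers if is_even(n)]`, which many Python codebases prefer for readability.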
