Learners are Working Professionals
Learners are Students and Job seekers
I transitioned my career from Manual Tester to Data Scientist by upskilling on my own using various online resources and lots of hands-on practice. For an internal switch, I sent around 150 emails to different project managers, interviewed for 20 projects and was selected for 10.
When it came to changing companies, I resigned with no offers in hand, and I struggled to get a job during my notice period. The first 2 months were very difficult, but in the last month things started changing miraculously.
I attended 40+ interviews in a span of 3 months with the help of Naukri and LinkedIn profile optimizations, and received offers from 8 companies.
Your first assignment consists of various basic Python concepts. Python is important for every learner because it drives the machine learning solutions built in this course.
This assignment keeps beginners in mind and covers topics like Python variables, numeric operators, logical operators, conditional and loop statements (if, while and for), functions, strings and their operations/functions, and lists and list comprehensions, along with reference videos on every topic. Learners can learn by solving problems on each Python concept covered here.
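As a taste of the kind of problem this assignment involves, here is a minimal sketch that combines a function, a for loop, a conditional and a list comprehension. The function names are hypothetical and chosen only for illustration; they are not taken from the assignment itself.

```python
def squares_of_evens(numbers):
    """Return the squares of the even numbers, using a list comprehension."""
    return [n * n for n in numbers if n % 2 == 0]


def count_vowels(text):
    """Count vowels with a for loop and an if statement."""
    count = 0
    for ch in text.lower():
        if ch in "aeiou":
            count += 1
    return count


print(squares_of_evens(range(1, 7)))  # [4, 16, 36]
print(count_vowels("Data Science"))   # 5
```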
Data structures are ways of organizing and storing data, which makes them an important topic. This assignment covers various Python data structures and their implementations. The topics covered are lists and their operations such as slicing, deleting, appending and updating; list comprehensions; sets and their operations like union, intersection and difference; tuples and their implementation; and dictionaries and their operations like adding and removing key-value pairs and iterating over items. Along with hands-on problems, we have added reference videos on every topic covered.
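The operations listed above can be sketched in a few lines of plain Python. The sample data (learner names, scores, topic names) is made up purely for illustration.

```python
# Lists: slicing, appending and deleting
topics = ["lists", "sets", "tuples", "dicts"]
first_two = topics[:2]          # slicing returns a new list
topics.append("strings")        # add to the end
del topics[0]                   # remove by position

# Sets: union, intersection and difference
enrolled = {"asha", "ravi", "meena"}
submitted = {"ravi", "meena", "john"}
both = enrolled & submitted     # intersection
either = enrolled | submitted   # union
pending = enrolled - submitted  # difference

# Tuples: immutable, and can be unpacked
point = (3, 4)
x, y = point

# Dictionaries: adding, removing and iterating over key-value pairs
scores = {"ravi": 82, "meena": 91}
scores["john"] = 77             # add a key-value pair
removed = scores.pop("ravi")    # remove a key and get its value
total = 0
for name, score in scores.items():
    total += score
```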
NumPy aims to provide an array object that is up to 50x faster than traditional Python lists, and this assignment will help learners optimize their code using it. The NumPy assignment covers topics like defining NumPy arrays of different dimensions; functions that create arrays, like arange(), eye(), full(), diag() and linspace(); defining NumPy arrays with random values; reshaping arrays to different dimensions; array indexing and slicing; the difference between NumPy copies and views; bonus operations like hstack() and vstack(); array modifications using the insert, delete and append functions; and mathematical operations and searching in NumPy arrays. It also includes a practical demonstration of how arrays are faster than lists. To help you understand all topics thoroughly, we have added reference links on each topic.
Pandas has functions for analyzing, cleaning, exploring and manipulating data, which makes it an important library for data science. This assignment introduces pandas topics like Series and its operations such as sort, append and indexing; DataFrames and their operations such as accessing existing rows and columns and adding new rows or columns; converting Series to DataFrames; concatenating DataFrames; accessing DataFrame elements using conditions; DataFrame indexes; loc and iloc; reading CSV files; merging; and the groupby and apply functions. For more conceptual clarity, we have also added reference videos for all the topics.
Data cleaning plays an important role in data management as well as in analytics and machine learning. This assignment will give you practical experience in handling dirty data. You will learn how to treat inconsistent or irrelevant columns in the data; handle missing values by dropping empty records or imputing missing fields using techniques like forward fill, backward fill, mean imputation, constant imputation, interpolation and KNN; use pandas DataFrame shallow and deep copy methods; work with and optimize code that uses iterrows and itertuples; rename columns with meaningful labels; treat duplicate values; and treat constant (low-variance) columns. You will also apply regular expressions to textual data to play with different patterns. For more conceptual clarity, we have also added reference links on each topic.
Regular expressions, regex or regexp for short, are extremely powerful for searching and manipulating text strings, particularly when processing text data. One line of regex can easily replace several dozen lines of programming code. Regex is widely used in projects that involve text validation, NLP and text mining, which makes it a useful tool to know. In this assignment you will solve easy to hard regex problems like matching digit and non-digit characters, detecting HTML tags in text, IP address validation, detecting email addresses, detecting domain problems, whitespace and non-whitespace problems, and substring problems. Reference video and document links are provided for assistance.
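Two of the problem types mentioned above, IP address validation and email detection, can be sketched with Python's built-in re module. The patterns below are simplified illustrations, not the assignment's reference solutions: the IPv4 pattern rejects leading zeros by design, and the email pattern is deliberately loose and does not cover every valid address.

```python
import re

# One IPv4 octet: 0-255, with no leading zeros (an assumption of this sketch)
IPV4_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4_PATTERN = re.compile(rf"{IPV4_OCTET}(?:\.{IPV4_OCTET}){{3}}")

# A loose email pattern: local part, "@", domain, dot, top-level domain
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def is_valid_ipv4(text):
    """Return True if the whole string is a dotted-quad IPv4 address."""
    return IPV4_PATTERN.fullmatch(text) is not None


def find_emails(text):
    """Return every substring that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


print(is_valid_ipv4("192.168.1.1"))   # True
print(is_valid_ipv4("256.1.1.1"))     # False
print(find_emails("Contact a.b@example.com or team@cloud.io."))
# ['a.b@example.com', 'team@cloud.io']
```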
Exploratory Data Analysis is a way of visualizing, summarizing and interpreting the information hidden in row-and-column data. In this assignment you will apply the data cleaning techniques you learned in the previous assignment, use methods to fetch basic statistical information from the data, detect the outliers that pollute the data using techniques like IQR and Z-score, and remove them to make a uniform dataset. You will implement univariate plots like box plots, bar plots, count plots, histograms and density plots; bivariate plots like scatter plots, line plots, box plots with respect to a third variable and joint distribution plots; and multivariate plots like pair plots, multivariate scatter plots, parallel coordinates and heatmaps. Every topic has a YouTube reference link to give you better conceptual clarity.
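The two outlier detection rules named above can be sketched with the standard statistics module. The multiplier 1.5 and the threshold 3.0 are common conventions assumed here, not values stated by the course, and the sample readings are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumes the values are not all equal
    return [v for v in values if abs(v - mean) / sd > threshold]


readings = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(readings))                    # [95]
print(zscore_outliers(readings))                 # []
print(zscore_outliers(readings, threshold=2.0))  # [95]
```

On a sample this small, the extreme value inflates the standard deviation so much that its own Z-score stays below 3, which is one reason the IQR rule is often preferred for small or skewed datasets.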
Feature selection in machine learning means finding the best set of features to reduce computational cost and improve the performance of ML models. This assignment is full of techniques used for feature selection. Here you will implement intrinsic methods like tree-based feature selection using feature importance and SelectFromModel, wrapper methods like RFE, and filter methods like SelectKBest, missing value ratio threshold, variance threshold, the chi-square test and the ANOVA test. You will also learn about the univariate ROC-AUC test and techniques to detect and remove multicollinearity, like the variance inflation factor. For our learners, we have also added reference links on each topic.
Feature engineering is the process of transforming raw data into features that better represent the underlying problem to predictive models. This assignment gives you a chance to practice feature engineering using various techniques. Using a sample dataset, you will create new features out of raw data by calculating sums, differences, means etc.; impute missing values, both textual and numerical; detect outliers using percentiles and standard deviation and handle them; use binning to avoid overfitting; apply encoding techniques like one-hot encoding and label encoding; apply scaling techniques like normalization and standardization; implement variable transformations using log, square root, reciprocal and exponential functions; and perform date and time engineering. Reference video links are provided.
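The scaling and encoding techniques named above can be sketched in plain Python. In practice these steps are usually done with a library; the helper names below are hypothetical, and the sketch assumes the numeric column is not constant.

```python
import statistics


def min_max_normalize(values):
    """Normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: rescale to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def label_encode(categories):
    """Label encoding: map each category to an integer (alphabetical order)."""
    mapping = {level: i for i, level in enumerate(sorted(set(categories)))}
    return [mapping[c] for c in categories]


def one_hot_encode(categories):
    """One-hot encoding: one 0/1 column per category (alphabetical order)."""
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


print(min_max_normalize([10, 20, 30]))         # [0.0, 0.5, 1.0]
print(label_encode(["red", "blue", "red"]))    # [1, 0, 1]
print(one_hot_encode(["red", "blue", "red"]))  # [[0, 1], [1, 0], [0, 1]]
```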
Simple Linear Regression is a type of regression algorithm. In this assignment you will learn how to build your first machine learning model using simple linear regression, and you will learn the mathematics behind how it works. Learners will be provided with a dataset in which they can find the relationship between two variables statistically and visually, and find the best-fit line by fitting the model on training data. You can also look at your model's intercept and slope, predict on the test dataset, evaluate the model using RMSE, R square and residual square error, and learn the basics of overfitting, underfitting and the assumptions of simple linear regression. Learners are provided with a reference link on each topic for better conceptual clarity.
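The best-fit line and two of the evaluation metrics mentioned above can be sketched without any library, using the ordinary least squares closed form: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The function names and the sample data are illustrative assumptions.

```python
import math


def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the least squares line through (x, y)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]          # exactly y = 2x + 1
slope, intercept = fit_simple_linear_regression(x, y)
predictions = [slope * xi + intercept for xi in x]
print(slope, intercept)       # 2.0 1.0
```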
Multiple Linear Regression is an extension of Simple Linear Regression, as it takes more than one predictor variable to predict the response variable. This assignment gives you the opportunity to implement the various steps involved in machine learning, like data cleaning, feature engineering and feature selection, and finally to build a Multiple Linear Regression model using ordinary least squares, evaluate it using metrics like R square and adjusted R square, and visualize parity, trend and error term plots. A reference link is also added on each topic for better conceptual clarity.
Advanced regression techniques introduce regularization using the Ridge, Lasso and ElasticNet algorithms in order to reduce model error and avoid overfitting. In this assignment you will explore the provided dataset using EDA, clean the data, prepare it using data engineering techniques, scale the training data, build Ridge, Lasso and ElasticNet models, and implement hyperparameter tuning using GridSearchCV. You will also build a polynomial regression to create a generalized model, and perform model evaluation and selection using adjusted R square, MAE, RMSE and R square. YouTube reference links on each topic are also provided to give better conceptual clarity.
Logistic regression is the go-to method for binary classification problems (problems with two class values). In this assignment you will discover the logistic regression algorithm for machine learning. You will implement logistic regression through a case study in which you will play with the data, clean it, prepare it using feature engineering methods, remove outliers, apply feature scaling and build your first classification model using logistic regression. You will then remove multicollinearity using VIF, rebuild the model, evaluate it using the confusion matrix, plot the ROC curve, select a cutoff probability and finalize the best model. Reference links on each topic are provided to give better conceptual clarity.
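To illustrate how the choice of cutoff probability changes the confusion matrix, here is a minimal sketch. It assumes class 1 is the positive class and that predicted probabilities are already available; the labels and probabilities are made up, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))


def confusion_counts(y_true, probabilities, cutoff=0.5):
    """Count TP, FP, TN and FN when predicting 1 for probability >= cutoff."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, prob in zip(y_true, probabilities):
        predicted = 1 if prob >= cutoff else 0
        if predicted == 1 and actual == 1:
            counts["tp"] += 1
        elif predicted == 1 and actual == 0:
            counts["fp"] += 1
        elif predicted == 0 and actual == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


y_true = [1, 0, 1, 0, 1]
probs = [0.9, 0.4, 0.35, 0.6, 0.8]
print(confusion_counts(y_true, probs, cutoff=0.5))
# {'tp': 2, 'fp': 1, 'tn': 1, 'fn': 1}
print(confusion_counts(y_true, probs, cutoff=0.3))
# {'tp': 3, 'fp': 2, 'tn': 0, 'fn': 0}
```

Lowering the cutoff catches every positive case here, at the price of more false positives, which is the trade-off the ROC curve visualizes.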
In this assignment you will discover Principal Component Analysis, a machine learning method for dimensionality reduction. You will learn and implement the math behind PCA: standardization, covariance matrix computation, computing eigenvectors and eigenvalues and their use, and creating feature vectors along with a two-dimensional visualization of them. Finally, you will build principal-component-based data from a given dataset. Reference video links are provided on each topic covered.
The k-nearest neighbors (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems. In this assignment you get hands-on experience implementing KNN, where you will learn about meshgrid concepts and create a function that tries different K values on nine different varieties of datasets. You will be required to solve a use case using KNN and build an effective model, making use of the best p value and improving the model by selecting the best k value based on accuracy scores. Reference links on each topic are provided for better conceptual clarity.
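A minimal from-scratch sketch of KNN classification follows. It assumes that p refers to the power parameter of the Minkowski distance (p=1 is Manhattan, p=2 is Euclidean), as in scikit-learn's KNeighborsClassifier; the course text does not spell this out. The function names and the toy points are illustrative.

```python
from collections import Counter


def minkowski_distance(a, b, p=2):
    """Minkowski distance between two points; p=2 gives Euclidean distance."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1 / p)


def knn_predict(train_points, train_labels, query, k=3, p=2):
    """Predict the majority label among the k nearest training points."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: minkowski_distance(pair[0], query, p),
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (2, 2), k=3))      # a
print(knn_predict(points, labels, (8.5, 8.5), k=3))  # b
```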
The Decision Tree algorithm uses a tree representation to solve classification or regression problems. In this assignment you will get an opportunity to solve a use case using a decision tree. You will discover the concepts behind splitting criteria like homogeneity, entropy, Gini index and information gain, learn about the hyperparameters involved in a decision tree, and tune them to improve model performance. You will be able to visualize your tree structure in order to understand the logic behind its splits. You will build your decision tree model and evaluate it using the confusion matrix, and prevent overfitting issues in decision trees using pruning, where the minimal cost-complexity method is used. Reference videos on each topic are provided to make your learning smooth.
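The splitting criteria named above can be computed directly from class labels. This is a minimal sketch with hypothetical function names, using the standard definitions: entropy is the negative sum of p * log2(p) over classes, Gini is 1 minus the sum of p squared, and information gain is the parent entropy minus the size-weighted entropy of the children.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the share of each class among the labels."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Entropy in bits; 0 for a pure node."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini index; 0 for a pure node."""
    return 1 - sum(p * p for p in class_proportions(labels))


def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
print(entropy(parent))                                          # 1.0
print(gini(parent))                                             # 0.5
print(information_gain(parent, ["yes", "yes"], ["no", "no"]))   # 1.0
```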
Naïve Bayes is a probabilistic machine learning algorithm based on Bayes' Theorem, used in a wide variety of classification tasks. In this assignment, we will work through the Naïve Bayes algorithm and all its essential concepts so that there is no room for doubt. Here we will implement the different Naïve Bayes algorithms available, like the Bernoulli, Multinomial and Gaussian variants, to solve a case study. You will learn how to vectorize words in textual data using CountVectorizer, and the concepts of Bayes' Theorem and Laplace smoothing will be made clear. Reference links on each topic are provided for better conceptual clarity.
Bagging is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. In this assignment you will discover bagging concepts by solving a case study. You will get to know about weak learners, the bias-variance tradeoff, the bagging meta-estimator, Random Forest, the bootstrap method, bootstrap aggregation, estimated performance and variable importance. Reference links are provided on each topic for better conceptual clarity.
Boosting algorithms often outperform simpler models like logistic regression and decision trees. Boosting is a general ensemble method that creates a strong classifier from a number of weak classifiers. In this assignment you will solve a case study using various boosting algorithms like AdaBoost (Adaptive Boosting), Gradient Tree Boosting, XGBoost, LightGBM and CatBoost. You will get an opportunity to learn how each algorithm works differently and to select the best model to solve your problem. Reference video links are provided on each topic to give better conceptual clarity.
Cluster analysis, or clustering, is an unsupervised machine learning task. It involves automatically discovering natural grouping in data.
This assignment gives you the opportunity to implement various clustering methods to solve a case study. Here you will learn by doing on topics like Affinity Propagation, Agglomerative Clustering, BIRCH, DBSCAN, K-Means, Mini-Batch K-Means, Mean Shift, OPTICS, Spectral Clustering, Mixture of Gaussians clustering methods, the Hopkins test and Hierarchical Clustering methods. Reference video links are provided on each topic to give better conceptual clarity.
The objective of the support vector machine (SVM) algorithm is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points; SVMs can be used for both classification and regression problems. In this assignment you will discover SVM by solving a hands-on case study. Here you will get to know about the hyperplane, the mathematics behind SVM, support vectors, SVM hyperparameters like C and gamma, kernels like linear, RBF and polynomial, the slack variable, and the advantages and disadvantages of using SVM. Reference videos on each topic are provided, along with interview videos, for better conceptual clarity.
Gradient descent is an optimization algorithm used when training a machine learning model, so it is important to know about it. In this assignment we will learn by doing on gradient descent topics like defining cost functions, implementing batch gradient descent and stochastic gradient descent, optimization, closed form vs. gradient descent, and evaluation using a plot of cost vs. time, learning rate, rescaling inputs, few passes and a plot of mean cost. Reference links on each topic are provided for better conceptual clarity.
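A minimal sketch of batch gradient descent for a one-feature linear model follows. It assumes the cost function is half the mean squared error, J(w, b) = sum((w*x + b - y)^2) / (2n); the learning rate, epoch count and sample data are illustrative choices, not values from the assignment.

```python
def batch_gradient_descent(x, y, learning_rate=0.05, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on half the mean squared error.

    Returns the fitted w and b plus the cost recorded at every epoch.
    """
    w, b = 0.0, 0.0
    n = len(x)
    cost_history = []
    for _ in range(epochs):
        # Every update uses the full dataset, which is what "batch" means
        errors = [w * xi + b - yi for xi, yi in zip(x, y)]
        cost_history.append(sum(e * e for e in errors) / (2 * n))
        grad_w = sum(e * xi for e, xi in zip(errors, x)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, cost_history


x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]   # exactly y = 2x + 1
w, b, costs = batch_gradient_descent(x, y)
print(round(w, 3), round(b, 3))   # 2.0 1.0
```

Plotting the returned cost history against the epoch number gives the cost curve; a learning rate that is too large makes that curve diverge instead of decrease.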
Cricket is one of the most popular sports in the world, especially in India. The game is highly uncertain. It is a sport that generates high revenue, so what if the winning team could be predicted before the match has even begun?
In this assignment we are going to predict the winner of a future ODI cricket match based on previous year’s match results.
In this capstone project we are going to recommend similar games to users based on their behavior.
This dataset is a list of user behaviors, with columns: user-id, game title, behavior name, value.
In this capstone project the task is to predict whether a transaction is fraudulent or non-fraudulent using the transaction data.
This dataset contains rows of known fraud and valid transactions made over Ethereum, a type of cryptocurrency. It is an imbalanced dataset project.
This is a capstone project in which the task is to predict the cost of shipping paintings, antiques, sculptures and other collectibles to customers based on the information provided in the dataset.
The dataset consists of parameters such as the artist’s name and reputation, dimensions, material, and price of the collectible, shipping details such as the customer information, scheduled dispatch, delivery dates, and so on.
In this capstone project we are going to scrape data from a COVID-19 India website and use the data to develop our own dashboard, which updates automatically by fetching fresh data at a fixed interval.
Big brands spend a significant amount on popularizing a product. Nevertheless, their efforts often go in vain when establishing the merchandise in the hyperlocal market. Depending on geographical conditions, the same attributes can convey very different information about the customer. Hence, insight into this is a must for any brand owner.
So in this project we are going to predict popularity from the other data in the dataset, like Store Ratio, Basket Ratio and Store Score.
The sentiment of financial news articles reflects and directs the performance of the U.S. stock market. By performing sentiment analysis on news headlines, we get a positive or negative label for each headline along with its confidence score.
We can then correlate this output with the stock market’s gains/losses on that particular day.
This is a capstone project in which we need to predict the trip duration from the other data in the dataset, like distance, location and weather information. Trip duration is the most fundamental measure in all modes of transportation.
Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems (ITS) and traveler information systems.
This is the first capstone project, in which we are going to implement salary prediction using a machine learning algorithm.
This model predicts the salary of an employee based on the employee's years of experience.
Once you are thorough with the practical, hands-on experience from the other modules, we will assist you with Assorted Interview Questions. Here we provide a set of 75+ of the most popular data science interview questions on technical and basic concepts, which you can expect to face.
This set of questions is the perfect guide for you to learn and practice all the concepts required to clear any Data Science interview.
We know how data science aspirants are tricked in interviews nowadays. Interviewers try to pin down your thought process rather than have you recite learned responses from memory.
But no fear when CloudyML is here! We will mould you to be perfectly interview ready by providing a unique set of scenario, situation and use-case based interview questions, with sample answers on how you might respond to a hypothetical situation in the future.
Deployment is the method by which you integrate a machine learning model into an existing production environment to make practical business decisions based on data. It is one of the last stages in the machine learning life cycle.
Here learners will get an opportunity to learn Flask and HTML, deal with APIs, build various machine learning models and deploy them on their localhost. Reference videos are provided for better conceptual clarity.
On completion of this module, we can assure you that you will be able to deploy any machine learning model using Flask.
When I started to learn Data Science, my friends suggested that I join one of the big institutes by paying lakhs of rupees. At that time I was a fresher and didn’t want to spend so much money.
So I started learning from Udemy and Coursera. The tutorials were great, but whenever I sat down to code a solution for a Data Science problem, I was clueless and had no confidence.
That’s when I started building projects and realized that this way I was able to learn far more than I was learning by just watching tutorial videos.
That is how the idea of assignment-driven learning was seeded in my mind, and I implemented it once I was able to join Tredence Analytics as a Data Scientist with a huge salary.
The course is designed in the form of Assignments. In Assignments we have topic wise video links followed by related questions. You are supposed to code the solutions.
If you are stuck somewhere, you can reach out to our mentors. They are available from 3 P.M. to 11:59 P.M. every day. After completion you need to submit the assignment, and you will receive the solution file.
Our mentor team comprises M.Tech and B.Tech students with good technical skills to resolve your course and assignment related queries.
The Data Science projects are present in the form of guided assignments. Assignments have instructions and you have to write the code based on these instructions.
It is an end-to-end implementation, from data cleaning, data preparation, feature engineering, feature selection, model building, hyperparameter tuning, model selection and model evaluation to model deployment on localhost and cloud.
No. There is no placement as part of the course. The course focuses on skill development for being able to clear Data science interviews.
This course has been designed keeping beginners in mind. You will be able to learn as we start from basics. We have many learners doing this course who had no prior coding experience.
It’s a one-time payment, and you will get lifetime access to this course.
After enrolling you would get access to our learning portal.
There we have uploaded videos, guided assignments and capstone projects. You will find the Skype link in the portal itself. You can ask your queries via live chat from 3 P.M. to 12 midnight every day.
We have also provided topic-wise interview Q&A and sample resumes.