How is CloudyML different from any other online learning platform?
- You will get guidance from our mentors on all assignment-related queries every day, including Saturdays and Sundays.
- You will get constant feedback on your assignments, which will help you improve.
- At CloudyML, you will get gentle reminders about your next assignments. We will keep you motivated and your Josh “High”!
Who this course is for
- College Students planning to learn Data Science from scratch.
- IT freshers planning to get into Data Science projects.
- Experienced IT professionals planning to switch to a Data Science profile.
- Students from Mechanical, Electrical, Civil, Electronics and other core branches who want to learn Data Science.
No prior coding experience required.
A self-paced AI course with lifetime access.
Total Course Fee
Complete Course Detail
Your first assignment consists of various basic Python concepts. Python is important for any learner in order to build the machine learning solutions in this course. This assignment keeps beginners in mind and covers topics like Python variables, numeric operators, logical operators, conditional and loop statements (if, while and for), functions, strings and their operations/functions, and lists and list comprehensions, along with reference videos on every topic. Learners can learn by solving problems on each Python concept covered here.
In this assignment you will solve logical questions in Python, which will help you crack Python coding interview rounds. It involves Python problems like solving mathematical equations, strings and string manipulation, playing with numbers and their digits, and many more similarly interesting coding questions. You will also get an opportunity to create a mini Python project: a game called “Hangman” that can act as an intelligent game. The game makes rich use of loops and conditionals (for, while, if/else etc.) to apply various conditions, Python functions, built-in string functions, break and continue statements, and lists.
Data structures are a way of organizing and storing data, which makes them an important topic. This assignment aims to cover various Python data structures and their implementations. The topics covered are lists and their operations such as slicing, deleting, appending and updating; list comprehensions; sets and their operations like union, intersection and difference; tuples and their implementation; and dictionaries and their operations like adding and removing key-value pairs and iterating over item values. Along with the hands-on problems, we have added YouTube reference videos on every topic covered.
NumPy aims to provide an array object that is up to 50x faster than traditional Python lists, and this assignment will help learners optimize their code using it. The NumPy assignment covers topics like defining NumPy arrays of different dimensions; NumPy functions to create arrays like arange(), eye(), full(), diag() and linspace(); defining NumPy arrays with random values; reshaping arrays to different dimensions; NumPy array indexing and slicing; the difference between the NumPy copy and view functions; bonus operations like hstack() and vstack(); array modifications using the insert, delete and append functions; and mathematical operations and searching in NumPy arrays. It also shows in practice how arrays are faster than lists. To help you understand all topics thoroughly, we have added YouTube reference links on each topic.
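As a small illustration of the copy vs view topic mentioned above, here is a minimal sketch, assuming NumPy is installed. The variable names are made up for this example and are not taken from the course material.

```python
import numpy as np

original = np.arange(6)      # a 1-D array holding 0..5

view = original.view()       # a new array object that shares the same data buffer
copy = original.copy()       # a new array object with its own, independent data

original[0] = 99             # modify the original array in place

# The view sees the change because it shares memory; the copy does not.
view_sees_change = bool(view[0] == 99)
copy_sees_change = bool(copy[0] == 99)

print("view sees change:", view_sees_change)
print("copy sees change:", copy_sees_change)

# The .base attribute tells you whether an array borrows its data from another array.
print("view borrows data from original:", view.base is original)
print("copy owns its data:", copy.base is None)
```

A practical consequence is that slicing, which returns views, can silently modify the source array, so call copy() when you need an independent array.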
Pandas has functions for analyzing, cleaning, exploring and manipulating data, which makes it an important library for data science. This assignment introduces pandas topics like pandas Series and their operations such as sort, append and indexing; pandas DataFrames and their operations like accessing existing rows and columns and adding new rows or columns; converting Series to DataFrames; concatenation of DataFrames; DataFrame element access using conditions; DataFrame indexes; loc and iloc; reading CSV files; merging; groupby; and the apply function. For more conceptual clarity, we have also added reference videos for all the topics.
Data cleaning plays an important role in data management as well as in analytics and machine learning. This assignment will give you practical experience in handling dirty data. You will learn how to treat inconsistent or irrelevant columns in the data; handle missing values by dropping empty records or imputing missing fields using techniques like forward fill, backward fill, mean imputation, constant imputation, interpolation and KNN; use pandas DataFrame shallow and deep copy methods; work with and optimize code using iterrows and itertuples; rename columns with meaningful labels; treat duplicate values; and treat constant (low-variance) column values. You will also apply regular expressions to textual data to play with different patterns. For more conceptual clarity, we have also added YouTube reference links on each topic.
Regular expressions, or regex or regexp for short, are extremely powerful for searching and manipulating text strings, particularly when processing text data. One line of regex can easily replace several dozen lines of programming code. In this assignment you will solve easy to hard regex problems like matching digit and non-digit characters, detecting HTML tags in text, IP address validation, detecting email addresses, detecting domains, whitespace and non-whitespace problems, and substring problems. Reference video and document links are provided for assistance. Regex is widely used in projects that involve text validation, NLP and text mining, which makes it a useful tool to know.
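To give a flavour of the IP address validation and email detection problems mentioned above, here is a minimal sketch using Python's built-in re module. The patterns and function names are illustrative assumptions, not the course's official solutions, and the email pattern is deliberately simplified.

```python
import re

# One IPv4 octet: a number from 0 to 255 without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# A full IPv4 address: four octets separated by dots.
IPV4_PATTERN = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

# Simplified email pattern for illustration only (not RFC-complete).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def is_valid_ipv4(text):
    """Return True if the whole string is a valid dotted-decimal IPv4 address."""
    return IPV4_PATTERN.fullmatch(text) is not None


def find_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


print(is_valid_ipv4("192.168.1.1"))
print(is_valid_ipv4("256.1.1.1"))
print(find_emails("Contact alice@example.com or bob.smith@mail.example.org."))
```

Using fullmatch rather than search matters for validation, because search would accept a string that merely contains a valid address somewhere inside it.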
Exploratory Data Analysis is a way of visualizing, summarizing and interpreting the information hidden in row-and-column data. In this assignment you will apply the data cleaning techniques you learned in the previous assignment, use methods to fetch basic statistical information from the data, detect the outliers that pollute the data, and remove them using techniques like IQR and Z-score to make a uniform dataset. You will implement univariate plots like box plots, bar plots, count plots, histograms and density plots; bivariate plots like scatter plots, line plots, box plots with respect to a third variable, and joint distribution plots; and multivariate plots like pair plots, multivariate scatter plots, parallel coordinates and heatmaps. Every topic has a YouTube reference link to give you better conceptual clarity.
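The IQR outlier removal mentioned above can be sketched with the standard library alone. This is a minimal example under stated assumptions: the helper names and the sample data are invented, the common 1.5 x IQR fence is used, and statistics.quantiles uses its default "exclusive" method, so the quartiles may differ slightly from the pandas or NumPy defaults.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences q1 - k*IQR and q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical sample: seven typical readings and one extreme value.
data = [10, 12, 11, 13, 12, 14, 11, 95]
cleaned = remove_outliers(data)

print("fences:", iqr_bounds(data))
print("cleaned:", cleaned)
```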
Feature selection in machine learning means finding the best set of features to reduce computational cost and improve the performance of ML models. This assignment is full of techniques used for feature selection. Here you will implement intrinsic methods like tree-based feature selection using feature importance and SelectFromModel; wrapper methods like RFE; and filter methods like SelectKBest, missing value ratio threshold, variance threshold, the Chi2 test and the ANOVA test. You will also learn about the univariate ROC_AUC test and techniques to remove multicollinearity like the variance inflation factor. For our learners, we have also added YouTube reference links on each topic.
Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models. This assignment gives you a chance to practice feature engineering using various techniques. Using a sample dataset, you will create new features out of raw data by calculating sum, subtraction, mean etc.; impute missing values, both textual and numerical; detect and handle outliers using percentiles and standard deviation; use binning to avoid overfitting; apply encoding techniques like one-hot encoding and label encoding; apply scaling techniques like normalization and standardization; implement variable transformations using log, square root, reciprocal and exponential; and perform date and time engineering. YouTube reference video links are provided.
Simple Linear Regression is a type of regression algorithm. In this assignment you will learn how to build your first machine learning model using simple linear regression, and you will learn about the mathematics behind how it works. Learners will be provided with a dataset in which they can find the relationship between two variables statistically and visually, and find the best-fit line by fitting the model on training data. You can also look at your model’s intercept and slope, predict on the test dataset, and evaluate the model using RMSE, R square and residual squared error, while learning the basics of overfitting, underfitting and the assumptions of simple linear regression. Learners are provided with a YouTube reference link on each topic for better conceptual clarity.
Multiple Linear Regression is an extension of Simple Linear Regression, as it takes more than one predictor variable to predict the response variable. This assignment will give you the opportunity to implement the various steps involved in machine learning, like data cleaning, feature engineering and feature selection, and finally to build a Multiple Linear Regression model using ordinary least squares and evaluate it using metrics like R square and adjusted R square, and by visualizing parity, trend and error term plots. A YouTube reference link is also added on each topic for better conceptual clarity.
This assignment on advanced regression techniques introduces regularization using the Ridge, Lasso and ElasticNet algorithms in order to reduce model error and avoid overfitting. In this assignment you will explore the provided dataset using EDA, clean the data, prepare it using data engineering techniques, scale the training data, build Ridge, Lasso and ElasticNet models, and implement hyperparameter tuning using GridSearchCV. You will also build a polynomial regression to create a generalized model. Model evaluation and selection use adjusted R square, MAE, RMSE and R square. YouTube reference links on each topic are also provided to give better conceptual clarity.
Logistic regression is the go-to method for binary classification problems (problems with two class values). In this assignment you will discover the logistic regression algorithm for machine learning. You will implement logistic regression through a case study in which you will play with the data, clean it, prepare it using feature engineering methods, remove outliers, apply feature scaling and build your first classification model using logistic regression. You will then remove multicollinearity using VIF, rebuild the model, evaluate it using a confusion matrix, plot the ROC curve, select the cutoff probability and finalize the best model. YouTube reference links are provided on each topic to give better conceptual clarity.
In this assignment you will discover the Principal Component Analysis (PCA) machine learning method for dimensionality reduction. You will learn and implement the math behind PCA: standardization, covariance matrix computation, computing eigenvectors and eigenvalues and their use, and creating feature vectors along with a two-dimensional visualization of them. Finally, you will build principal-component-based data from a given dataset. YouTube reference video links are provided on each topic covered.
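The PCA steps listed above (standardization, covariance matrix, eigenvectors and eigenvalues, feature vector, projection) can be sketched as follows. This is a minimal illustration assuming NumPy is installed. The function name pca and the random sample data are made up for the example and do not come from the course.

```python
import numpy as np


def pca(data, n_components=2):
    """Project data (rows = samples, columns = features) onto its top principal components."""
    # Step 1: standardize each feature to zero mean and unit variance.
    standardized = (data - data.mean(axis=0)) / data.std(axis=0)

    # Step 2: covariance matrix of the standardized features.
    covariance = np.cov(standardized, rowvar=False)

    # Step 3: eigen-decomposition (eigh is meant for symmetric matrices).
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)

    # Step 4: sort components by decreasing eigenvalue (explained variance).
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Step 5: the feature vector holds the top eigenvectors; project the data onto it.
    feature_vector = eigenvectors[:, :n_components]
    return standardized @ feature_vector, eigenvalues


# Hypothetical dataset: 50 samples with 3 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))

projected, eigenvalues = pca(data, n_components=2)
print(projected.shape)
print(eigenvalues / eigenvalues.sum())  # share of variance explained by each component
```

The two columns of projected are what you would plot for the two-dimensional visualization.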
The k-nearest neighbors (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems. In this assignment you will get hands-on experience implementing KNN, where you will learn about meshgrid concepts and create a function for different K values on nine different datasets. You will be required to solve a use case using KNN and build an effective model with the best k value. YouTube reference links on each topic are provided for better conceptual clarity.
The Decision Tree algorithm uses a tree representation to solve classification or regression problems. In this assignment you will get an opportunity to solve a use case using a decision tree. You will explore splitting criteria concepts like homogeneity, entropy, Gini index and information gain, learn about the hyperparameters involved in a decision tree, and tune them to improve model performance. You will be able to visualize your tree structure in order to understand the logic behind its splits. You will build your decision tree model and evaluate it using a confusion matrix, and prevent overfitting issues using pruning, where the minimal cost-complexity method is used. YouTube reference videos on each topic make your learning smooth.
Naïve Bayes is a probabilistic machine learning algorithm based on Bayes’ Theorem, used in a wide variety of classification tasks. In this assignment, we will understand the Naïve Bayes algorithm and all its essential concepts so that there is no room for doubt. Here we will implement the different Naïve Bayes algorithms available, like Bernoulli, Multinomial and Gaussian, to solve a case study. You will learn how to vectorize words in textual data using CountVectorizer, and the concepts of Bayes’ Theorem and Laplace smoothing will be made clear. YouTube reference links on each topic are provided for better conceptual clarity.
Bagging is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. In this assignment you will discover bagging concepts by solving a case study. You will get to know about weak learners, the bias-variance tradeoff, the bagging meta-estimator, Random Forest, the bootstrap method, bootstrap aggregation, estimated performance and variable importance. YouTube reference links are provided on each topic for better conceptual clarity.
Boosting algorithms often outperform simpler models like logistic regression and decision trees. Boosting is a general ensemble method that creates a strong classifier from a number of weak classifiers. In this assignment you will solve a case study using various boosting algorithms like AdaBoost (Adaptive Boosting), Gradient Tree Boosting, XGBoost, LightGBM and CatBoost. You will get an opportunity to learn the differences in how each algorithm works and to select the best model to solve your problem. YouTube reference video links are provided on each topic to give better conceptual clarity.
Cluster analysis, or clustering, is an unsupervised machine learning task. It involves automatically discovering natural groupings in data. This assignment will give you the opportunity to implement various clustering methods to solve a case study. Here you will learn by doing on topics like Affinity Propagation, Agglomerative Clustering, BIRCH, DBSCAN, K-Means, Mini-Batch K-Means, Mean Shift, OPTICS, Spectral Clustering, Mixture of Gaussians clustering methods, the Hopkins test and Hierarchical Clustering methods. YouTube reference video links are provided on each topic to give better conceptual clarity.
The objective of the support vector machine (SVM) algorithm is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points; it can be used to solve regression or classification problems. In this assignment you will discover SVM by solving a hands-on case study. Here you will get to know about the hyperplane, the mathematics behind SVM, support vectors, SVM hyperparameters like C and gamma, kernels like linear, RBF and polynomial, and the slack variable and the advantages and disadvantages of using it. YouTube reference videos on each topic are provided, along with interview videos, for better conceptual clarity.
Statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data. In this assignment you will learn about inferential statistics, probability theory, probability distributions, the binomial distribution, the cumulative distribution, the normal distribution, the z-score, sampling, sampling distributions, the Central Limit Theorem and confidence intervals. YouTube reference video links on each topic, along with some quizzes, are provided for better conceptual clarity.
Amazon Web Services (AWS) is a secure cloud services platform, offering compute power, database storage, content delivery and other functionality. Here learners will get the opportunity to learn Flask, Docker, AWS EC2 instances and S3 storage, and to build various machine learning models in AWS and deploy them. YouTube reference video links are provided for each topic for better conceptual clarity.
SQL is designed for managing data in a relational database management system (RDBMS). It is important for every data science professional to know SQL. In this assignment you will learn many SQL queries for managing data, like Select, Select Distinct, Where, And, Or, Not, Order By, Insert Into, Null Values, Update, Delete, Select Top, Min, Max, Count, Avg, Sum, Like, Wildcards, In, Between, SQL Aliases, Joins, Inner Join, Left Join, Right Join, Full Join, Self Join, Union, Group By, Having, Exists, Any, All, Select Into, Insert Into Select, Case, Null Functions, Stored Procedures, Comments and Operators. YouTube reference links on each topic are provided for better understanding.
Hypothesis testing is a statistical method used to make statistical decisions from experimental data. It is widely used to evaluate two mutually exclusive statements about a population to determine which statement is best supported by the sample data. In this assignment you will learn and implement hypothesis testing methods. We will cover the null and alternate hypotheses, one- and two-tailed tests, the critical value method, the z table, implementing a practical hypothesis example, the p-value, types of error and the t-distribution. YouTube reference videos are provided for all topics covered for better conceptual clarity.
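As one possible illustration of the critical value method and the p-value mentioned above, here is a minimal sketch of a two-tailed one-sample z-test using only the standard library. The function name, the sample values and the assumed population mean and standard deviation are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_std, alpha=0.05):
    """Two-tailed z-test of H0: the true mean equals population_mean.

    Assumes the population standard deviation is known.
    Returns (z statistic, p-value, whether H0 is rejected at level alpha).
    """
    n = len(sample)
    standard_error = population_std / sqrt(n)
    z = (mean(sample) - population_mean) / standard_error

    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # p-value approach: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    # Critical value approach: reject H0 when |z| exceeds the cutoff for alpha.
    critical_value = standard_normal.inv_cdf(1 - alpha / 2)
    return z, p_value, abs(z) > critical_value


# Hypothetical data: H0 says the population mean is 50, with a known std of 6.
sample = [52, 55, 51, 58, 54, 56, 53, 57, 55]
z, p_value, reject_null = one_sample_z_test(sample, population_mean=50, population_std=6)

print("z statistic:", round(z, 3))
print("p-value:", round(p_value, 4))
print("reject H0 at alpha = 0.05:", reject_null)
```

Both approaches always lead to the same decision: |z| exceeds the critical value exactly when the p-value is below alpha.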
Gradient descent is an optimization algorithm used when training a machine learning model, so it is important to know about it. In this assignment we will learn by doing on gradient descent topics like defining cost functions, implementing batch gradient descent and stochastic gradient descent, optimization, closed form vs gradient descent, evaluation by plotting cost vs time, learning rate, rescaling inputs, few passes, and plotting mean cost. YouTube reference links on each topic are provided for better conceptual clarity.
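To make the difference between batch and stochastic gradient descent concrete, here is a minimal pure-Python sketch that fits a line y = w*x + b with a mean squared error cost. The function names, learning rates, epoch counts and toy data are assumptions chosen for illustration, not values from the course.

```python
import random


def cost(w, b, xs, ys):
    """Mean squared error cost (halved, so the gradient has no factor of 2)."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def batch_gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Update w and b once per epoch using the gradient over the whole dataset."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []  # cost after each epoch, useful for a cost-vs-time plot
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(cost(w, b, xs, ys))
    return w, b, history


def stochastic_gradient_descent(xs, ys, learning_rate=0.05, epochs=500, seed=0):
    """Update w and b after every single sample, visiting samples in random order."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for i in indices:
            error = w * xs[i] + b - ys[i]
            w -= learning_rate * error * xs[i]
            b -= learning_rate * error
    return w, b


# Toy data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_batch, b_batch, history = batch_gradient_descent(xs, ys)
w_sgd, b_sgd = stochastic_gradient_descent(xs, ys)

print("batch:", round(w_batch, 3), round(b_batch, 3))
print("sgd:  ", round(w_sgd, 3), round(b_sgd, 3))
```

The history list returned by the batch version is the series you would plot for the cost vs time evaluation.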
Challenging Capstone Projects
That Will Enhance Your Resume
Data about an employee is given (experience, degree etc.) and the salary has to be predicted.
Data about a bike trip is given (distance, temperature, wind, day etc.) and the duration of the trip has to be predicted.
The dataset consists of parameters such as the artist’s name and reputation; the dimensions, material and price of the collectible; and shipping details such as customer information, scheduled dispatch and delivery dates. The task is to predict the cost of shipping paintings, antiques, sculptures and other collectibles to customers based on the information provided in the dataset.
Data about the merchandise is given (store presence, category, time etc.) and the popularity of the merchandise has to be predicted.
Data on 6 Premier League teams (Arsenal, Liverpool, Man City, Man Utd, Chelsea, Tottenham) is given for the last 15 years and EDA has to be performed on it.
Steam is a marketplace for games and a social network for gamers, and this dataset is based on it. The dataset consists of a collection of user behaviours such as purchase and play, with the columns user-id, game-title, behaviour-name and value, where value indicates the degree to which the behaviour is performed. Using this information, we can recommend similar games that the user is likely to enjoy.
The sentiment of financial news articles reflects and directs the performance of the U.S. stock market. By performing sentiment analysis on the news headlines, we get the label of positive or negative with their confidence scores. By using this output we can correlate to the stock market’s gains/losses on that particular day.
We scrape data from this website and use it to develop our own dashboard, which updates automatically by fetching data at a fixed interval.
What is the method of learning?
The course is designed in the form of assignments. Each assignment has topic-wise YouTube video links followed by related questions. You are supposed to code the solutions. If you get stuck somewhere, you can reach out to our mentors. They are available from 10 A.M. to 10 P.M. every day. After completion, you need to submit the assignment for evaluation and move on to the next assignment.
What do the Data Science projects look like?
The Data Science projects are presented in the form of guided assignments. Assignments have instructions, and you have to write the code based on these instructions. Each one is an end-to-end implementation, from data cleaning, data preparation, feature engineering, feature selection, model building, hyperparameter tuning, model selection and model evaluation to model deployment on localhost and cloud.
Are there any pre-requisites?
There are no pre-requisites as such. You can join even if you don’t have any prior coding experience.
What People are saying about us on social media
CloudyML Course Completion Certificate Sample
Meet the course designer.
Hello, I'm Akash.
I'm a Data Scientist.
I transitioned my career from Manual Tester to Data Scientist by upskilling on my own using various online resources and doing lots of hands-on practice. For an internal switch, I sent around 150 mails to different project managers, interviewed for 20 projects and got selected in 10.
When it came to changing companies, I put in my papers with NO offers in hand. During the notice period I struggled to get a job. The first 2 months were very difficult, but in the last month things started changing miraculously.
I attended 40+ interviews in a span of 3 months with the help of Naukri and LinkedIn profile optimizations, and got offers from 8 companies.
Now I want to use my experience to help people upskill themselves and land their dream Data Science jobs!
No. There is no placement as part of the course. The course focuses on skill development for being able to clear Data science interviews.
This course has been designed keeping beginners in mind. You will be able to learn as we start from basics. We have many learners doing this course who had no prior coding experience.
It’s a one-time payment of INR 999 and you will get lifetime access to this course.
Our mentor team comprises M.Tech and B.Tech students with good technical skills who can resolve your course- and assignment-related queries.