# Fundamentals of Deep Learning You Must Know

## HISTORY OF NEURAL NETWORKS AND DEEP LEARNING

#### Neural Networks are an important Machine Learning Modelling framework

● The first and simplest Neural Network model was the Perceptron, built in 1957 by Rosenblatt

● With small changes, a perceptron can become a logistic regression

● After WW2, in the US people were trying to translate messages from Russian to English

● Contributors: Alan Turing (father of modern computing)

#### ○ Curiosity raised these questions:

■ What is intelligence?

■ How should we build it artificially?

● There was a biological inspiration

● Thanks to work in neuroscience, a rough understanding of how the brain works had developed

## Biological Neuron vs Artificial Neuron

A biological neuron has a nucleus, dendrites, and a cell body. When a neuron receives electrical signals as inputs, it performs computations internally and sends the processed electrical signals as output, possibly to other neurons.

Some dendrites are thicker than others, so the inputs they carry have more influence on the neuron.

In the same way, some inputs to an artificial neuron are more important than others, so weights are placed on the edges of the connections.

Output = f(W1x1 + W2x2 + W3x3 + ……)

The perceptron is this single neuron. It is loosely inspired by the biological neuron rather than an exact replica of it, but it is powerful enough to solve many interesting problems. In biology, a neuron does not exist by itself; it is connected to other neurons.

## REPRESENTATION: LOGISTIC REGRESSION AND PERCEPTRON

We all know logistic regression: it finds a plane that separates the positive data points from the negative data points. Let's understand logistic regression from a neural network perspective.

In LR, given x_i we predict y_i. Given a dataset D = {x_i, y_i}, training LR means finding "W" and "b".

The prediction is y_i_hat = sigmoid(W^T x_i + b)

If we want to represent logistic regression with a neuron, we can do it like this.

Here the weighted sum of the inputs is passed through the activation function. The activation function changes from one algorithm to another; in logistic regression the activation function is the sigmoid function.

**Inputs** ⇒ The inputs are the features of a data point, so the number of inputs is the same as the dimension of our data.

**Weights** ⇒ A weight tells us how important a feature is. If a weight is large in magnitude, the feature matters more for determining the output Y; if the weight is close to zero, the feature adds little information for determining Y.

**Bias** ⇒ In logistic regression we find the plane that separates the -ve data points from the +ve data points. The equation of the plane is W^T.X + b, where b is the bias, also called the intercept term of the plane.

**Activation function** ⇒ The activation can be thought of as a mathematical "gate" between the input feeding the current neuron and its output, which goes either to the next layer or to the final prediction.

Here the activation function is the sigmoid function.

X_i = (x_i1, x_i2, …, x_id) ⇒ a data point with d dimensions (d features).

W = [W1, W2, …, Wd] ⇒ the weight corresponding to each feature.

Input to the activation function ⇒ W^T X_i + b

Output ⇒ sigmoid(W^T X_i + b), where sigmoid is the activation function.
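The neuron view of logistic regression described above can be sketched in a few lines of plain Python. This is only an illustration: the function names `sigmoid` and `logistic_neuron` and the example values for the inputs, weights, and bias are assumptions, not something taken from a trained model.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_neuron(x, w, b):
    """Weighted sum of the inputs plus the bias, passed through sigmoid."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical data point with d = 3 features, and hand-picked weights.
x_i = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]   # the third feature has weight 0, so it is ignored
b = 0.0

# z = 0.5*1.0 - 0.25*2.0 + 0.0*3.0 + 0.0 = 0, and sigmoid(0) = 0.5
y_hat = logistic_neuron(x_i, w, b)
print(y_hat)
```

The output can be read as the predicted probability of the positive class. A value of 0.5 means the point lies exactly on the separating plane.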

## Perceptron

In artificial neural networks, there are two types of perceptrons: the single-layer perceptron and the multilayer perceptron (MLP).

##### 1. Single layer perceptron

A simple perceptron looks like this.

The only difference between logistic regression and the perceptron is the activation function. The activation function in the perceptron is as follows.

The perceptron is a very basic neural network with only one layer, which means only one activation function.

$$
F(X_i) = \begin{cases} 1 & \text{if } W^T X_i + b > 0 \\ 0 & \text{otherwise} \end{cases}
$$

$$
W^T X_i + b = \sum_{j=1}^{d} W_j X_{ij} + b
$$

If this sum is > 0, return 1; else return 0.

Xij ⇒ value of the jth feature of the ith data point

Wj ⇒ weight corresponding to the jth feature

The sum over the d features is the same equation written out one term per feature.

The perceptron is also a linear classifier, which means it separates the data points using a plane or hyperplane.

The difference between the perceptron and logistic regression also shows up in the loss function: the logistic loss passes W^T X_i + b through a sigmoid function, while the perceptron uses no sigmoid function. The rest is the same.

The perceptron is trained with Stochastic Gradient Descent (SGD), and training means finding the weights.
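As a rough sketch of what this training looks like, the code below uses the classic perceptron update rule, which can be viewed as SGD on the perceptron loss: visit one data point at a time and adjust the weights only when the point is misclassified. The function names, the learning rate, the number of epochs, and the toy AND dataset are all assumptions chosen for illustration.

```python
def predict(x, w, b):
    """Step activation: return 1 if W^T x + b > 0, else 0."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return 1 if z > 0 else 0

def train_perceptron(data, epochs=20, lr=1.0):
    """Find w and b by updating on one data point at a time."""
    d = len(data[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for x, y in data:
            error = y - predict(x, w, b)  # 0 if correct, +1 or -1 if wrong
            w = [w_j + lr * error * x_j for w_j, x_j in zip(w, x)]
            b += lr * error
    return w, b

# Toy, linearly separable dataset (an assumption): the logical AND function.
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

w, b = train_perceptron(and_data)
predictions = [predict(x, w, b) for x, _ in and_data]
print(predictions)  # [0, 0, 0, 1], matching the labels
```

Because the AND data can be separated by a line, the updates stop changing the weights after a few passes. On data that is not linearly separable, this rule would keep updating without settling.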

##### 2. Multilayer perceptrons (MLP)

We have seen what the perceptron is: a very simplified model of a single neuron. So what if we connect a bunch of these neurons together? That is nothing but a multilayer perceptron.

**Why should we care about the MLP?**

The biological inspiration is that if a single neuron, or perceptron, is powerful, then an interconnected set of neurons should be even more powerful.

In an MLP there is more than one layer and more than one neuron in the internal structure, and the output of the first layer is the input of the second layer.

A linear model is the simplest model. If our data is non-linear, like a cosine curve, we need a more powerful model, and an MLP can solve this kind of problem.
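To make the idea of stacked layers concrete, here is a minimal sketch of a two-layer network with hand-picked weights. The XOR problem is used as a stand-in for non-linear data because it is small enough to check by hand; it is not an example from this article, and the weights are chosen manually rather than learned.

```python
def step(z):
    """Perceptron activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def neuron(x, w, b):
    """A single perceptron: weighted sum plus bias, then step."""
    return step(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)

def mlp_xor(x):
    # Hidden layer: two neurons that both read the raw inputs.
    h1 = neuron(x, [1.0, 1.0], -0.5)  # fires if at least one input is 1
    h2 = neuron(x, [1.0, 1.0], -1.5)  # fires only if both inputs are 1
    # Output layer: its inputs are the outputs of the hidden layer.
    return neuron([h1, h2], [1.0, -1.0], -0.5)

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [mlp_xor(x) for x in inputs]
print(outputs)  # [0, 1, 1, 0]
```

No single line separates the XOR classes, so one perceptron cannot produce these outputs, but two layers of the same simple neurons can.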
