Neural Networks — Introduction

Neural networks have taken centre stage and now dominate much of the AI domain. Learning about neural networks has become close to a necessity, largely because of their high performance and their applicability across a wide range of fields. Neural networks are computing systems loosely modelled on the brains of animals. These systems learn to carry out tasks such as classification or prediction through an iterative process. Neural networks often outperform traditional algorithms such as support vector machines (SVM) and random forests (RF) as the amount of available data increases. Now, what the hell is happening inside a neural network? The most succinct answer would be: mathematical computations.

Let me take an example problem to explain in detail how a neural network functions. Consider the housing price problem: you want to find the price of a house given certain features of the house. For example, you have details such as the size of the house, its locality and the number of bedrooms, and you are asked to provide an estimate of the price. A neural network is a suitable method for estimating that price.

The above image describes what a neural network looks like. There are three types of layers: an input layer, hidden layers and an output layer. Each hidden layer can contain any number of neurons (nodes). The number of nodes in the input layer equals the number of features in our problem (in our case 3, viz. size, locality and number of bedrooms). The number of nodes in the output layer equals the number of values we want to predict (in our case 1, viz. the price of the house).
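To make the layer sizes concrete, here is a minimal NumPy sketch of the housing network's shape. The hidden layer width of 4 and the variable names are assumptions chosen purely for illustration; only the 3 inputs and 1 output come from the example above.

```python
import numpy as np

# Layer sizes for the housing example: 3 input features, 1 predicted value.
# The hidden layer width (4) is an arbitrary choice for illustration.
n_inputs, n_hidden, n_outputs = 3, 4, 1
layer_sizes = [n_inputs, n_hidden, n_outputs]

rng = np.random.default_rng(0)

# One weight matrix and one bias vector for each pair of consecutive layers.
# Each matrix has shape (nodes in this layer, nodes in the previous layer).
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in layer_sizes[1:]]

for i, (W, b) in enumerate(zip(weights, biases), start=1):
    print(f"layer {i}: weights {W.shape}, biases {b.shape}")
# layer 1: weights (4, 3), biases (4,)
# layer 2: weights (1, 4), biases (1,)
```

Notice that the input layer itself needs no weights: it only holds the 3 feature values, and the weights belong to the connections into the hidden and output layers.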

Now, let's go one level deeper and try to understand what is happening in each of these nodes.

Each node that receives the input features takes those features together with its weights and a bias, and computes an output z (a weighted sum of the inputs plus the bias) as shown above. We handle these values as matrices, which makes the computation easier and faster. What are weights and biases? They are values that are typically initialised randomly at first (for example, from a Gaussian distribution) and are then used to compute the output of the node. It is by tuning these values (weights and biases) that we are able to fit the neural network to our data.
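As a rough sketch of that node computation, the snippet below computes z for a single node and then for a whole layer at once using a matrix. The feature values, the numeric encoding of locality and the layer width of 4 are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# One house: size (in thousands of sq ft), locality (encoded as a number),
# number of bedrooms. These values and the encoding are hypothetical.
x = np.array([1.2, 2.0, 3.0])

# Weights and bias drawn from a Gaussian distribution, as described above.
w = rng.normal(size=3)
b = rng.normal()

# Single node: weighted sum of the inputs plus the bias.
z = np.dot(w, x) + b

# A whole layer of 4 nodes at once: each row of W holds one node's weights.
W = rng.normal(size=(4, 3))
b_vec = rng.normal(size=4)
z_layer = W @ x + b_vec  # one z value per node, shape (4,)

print(z)
print(z_layer)
```

The matrix form does exactly what the single-node form does, once per row of `W`, which is why treating the values as matrices keeps the computation compact.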

After computing the value z, we apply an activation function (such as ReLU or sigmoid) to z. Activation functions are used to introduce non-linearity into the model. If we do not apply any activation function, the output of the whole network is only a linear function of its inputs, no matter how many layers it has, and it might not succeed in mapping complex inputs to outputs.
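The two activation functions named above can be sketched in a few lines of NumPy. The second half of the snippet illustrates the point about linearity: two layers stacked without an activation give the same result as one suitably chosen linear layer. All shapes and values here are arbitrary, chosen for illustration only.

```python
import numpy as np

def relu(z):
    # Keeps positive values, replaces negative values with 0.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))     # [0. 0. 3.]
print(sigmoid(z))  # the middle entry is exactly 0.5

# Without an activation, two layers collapse into a single linear layer.
rng = np.random.default_rng(1)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)

two_layers = W2 @ (W1 @ x + b1) + b2
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_layers, one_layer))  # True
```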

In the hidden layers, the inputs of the nodes are the outputs of the previous layer's nodes. Finally, the output layer predicts a value, this value is compared to the known value (the ground truth), and a loss is computed. Intuitively, the loss represents how far off the predicted value is from the ground truth value. Using this loss value, the weights and biases in the network are adjusted, step by step, to reduce the loss so that the predictions move closer to the correct output.
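Putting the pieces together, here is a minimal end-to-end sketch of the loop just described: forward pass, loss, and adjustment of weights and biases. The article does not name a specific loss or update rule, so this sketch assumes mean squared error and plain gradient descent. The synthetic data, the hidden width of 8, the learning rate and the number of steps are all arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 "houses" with 3 features each. The rule that generates
# the prices is made up so that there is something non-linear to learn.
X = rng.normal(size=(200, 3))
y = (X @ np.array([3.0, 1.0, 2.0]) + 0.5 * X[:, 0] ** 2).reshape(-1, 1)

# Network: 3 inputs -> 8 hidden nodes (ReLU) -> 1 output.
# Rows of X are houses, so weights are stored as (inputs, outputs) here.
W1 = rng.normal(scale=0.5, size=(3, 8))
b1 = rng.normal(scale=0.5, size=(1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = rng.normal(scale=0.5, size=(1, 1))

learning_rate = 0.01  # assumed value, not tuned
losses = []

for step in range(1000):
    # Forward pass: each layer's output is the next layer's input.
    z1 = X @ W1 + b1
    a1 = np.maximum(0.0, z1)  # ReLU activation
    y_pred = a1 @ W2 + b2     # output layer, no activation for regression

    # Loss: mean squared error between prediction and ground truth.
    error = y_pred - y
    loss = np.mean(error ** 2)
    losses.append(loss)

    # Backward pass: gradient of the loss with respect to each parameter.
    d_pred = 2.0 * error / X.shape[0]
    dW2 = a1.T @ d_pred
    db2 = d_pred.sum(axis=0, keepdims=True)
    d_z1 = (d_pred @ W2.T) * (z1 > 0)  # ReLU passes gradient where z1 > 0
    dW1 = X.T @ d_z1
    db1 = d_z1.sum(axis=0, keepdims=True)

    # Update: nudge every weight and bias in the direction that lowers the loss.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"loss at start: {losses[0]:.3f}, loss at end: {losses[-1]:.3f}")
```

In practice a library such as PyTorch or TensorFlow computes the gradients for you, but writing them out once by hand shows that "tuning the weights and biases" is just repeated arithmetic driven by the loss.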

I hope this story has provided some basic intuition on how a neural network works. Thank you.