- Basics of machine learning
- What is deep learning
An artificial neural network (ANN) is a machine learning model inspired by biological neural networks, i.e. human and animal brains.
A perceptron is (as some of you might know) the fundamental building block of a neural network.
The approach is really simple. First, we multiply each of these inputs by its corresponding weight, so x1 is multiplied by w1, x2 by w2, and so on. Once we have all m products, we find their sum. We then pass this single number (the sum) through something called an activation function (most of the time a non-linear function) to get the output of our perceptron.
Now, this isn’t entirely correct: after we find the sum, we also add a bias term before applying the activation function. The bias lets you shift the output regardless of the input.
Now this is our output for a single perceptron.
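To make the steps above concrete, here is a minimal sketch of a single perceptron in plain Python. The sigmoid activation and the example numbers are my own assumptions for illustration; any other activation function could be swapped in.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum + bias)

# Hypothetical example values, chosen only for illustration.
x = [1.0, 2.0, 3.0]    # inputs x1, x2, x3
w = [0.5, -0.25, 0.0]  # weights w1, w2, w3
b = 0.0                # bias
output = perceptron(x, w, b)
print(output)
```

Changing only `b` shifts the value that goes into the activation function, which is exactly the "adjust the output regardless of the input" role of the bias.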
Now, by simply adding another perceptron, we get a multi-output perceptron. In this image, we can see that we are getting two outputs. Each output node has its own set of weights and its own bias, but the same activation function. What that means is that each connecting line has its own numerical value.
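A multi-output perceptron can be sketched as the same computation repeated once per output node. The function name `multi_output_perceptron`, the sigmoid activation, and the numbers below are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def multi_output_perceptron(inputs, weights, biases, activation=sigmoid):
    """weights[j] holds the weights of output node j, biases[j] its bias.

    Every node sees the same inputs and uses the same activation function,
    but has its own weights and bias.
    """
    outputs = []
    for node_weights, node_bias in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, node_weights)) + node_bias
        outputs.append(activation(z))
    return outputs

# Hypothetical values: two inputs feeding two output nodes.
x = [1.0, 2.0]
W = [[0.0, 0.0],    # weights of output node 1
     [1.0, -0.5]]   # weights of output node 2
b = [0.0, 1.0]      # one bias per output node
y = multi_output_perceptron(x, W, b)
print(y)
```

Each inner list in `W` corresponds to the lines arriving at one output node, so two inputs and two outputs give four weights in total.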
Multilayer neural network (actual ANNs)
Now, if we treat the outputs of a multi-output perceptron as new inputs and repeat the process, we get a multilayer neural network. Each of these fully connected layers is also called a dense layer.
Remember, here I showed you an ANN with just one hidden layer. You can have several hidden layers, like this, each with a different number of nodes.
You might also have multiple outputs at the end; it really just depends on your requirements. Also, I am mentioning this again: each layer typically uses one kind of activation function for all its nodes. So layer 1 (a.k.a. hidden layer 1) might use a ReLU function, while the second might use a sigmoid function.
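Stacking layers can be sketched as a loop that feeds each layer's outputs into the next one. In this sketch the layer sizes, the weights, and the choice of ReLU for the hidden layer and sigmoid for the output layer are all assumptions picked for illustration.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases, activation):
    """One fully connected layer: every node uses the same activation."""
    return [
        activation(sum(x * w for x, w in zip(inputs, node_weights)) + node_bias)
        for node_weights, node_bias in zip(weights, biases)
    ]

def forward(inputs, layers):
    """layers is a list of (weights, biases, activation) tuples, applied in order."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense_layer(values, weights, biases, activation)
    return values

# Hypothetical network: 2 inputs -> 3 hidden nodes (ReLU) -> 1 output (sigmoid).
hidden = ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [0.0, 0.0, 0.0], relu)
output = ([[1.0, -1.0, 2.0]], [0.0], sigmoid)
prediction = forward([2.0, 2.0], [hidden, output])
print(prediction)
```

Adding another hidden layer only means adding another tuple to the list, as long as its number of weights per node matches the number of outputs of the layer before it.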
Training the model
Training an ANN is not very different from training other models. You compute the output with randomly initialized parameters (weights and biases), calculate the loss, and then optimize the parameters with methods like stochastic gradient descent (SGD). Training an ANN can be a little tricky, because computing the gradients relies on some mathematics such as the chain rule. But you don’t need to worry about that: when you train a neural network in real life, most of these things are handled by frameworks like PyTorch.
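To show what a framework would otherwise do for you, here is a rough sketch of that training loop written by hand for a single sigmoid perceptron. The toy dataset (logical OR), the squared-error loss, the learning rate, and the number of epochs are all assumptions chosen for illustration, not recommended settings.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    return sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)

def mean_squared_loss(data, weights, bias):
    return sum((predict(x, weights, bias) - y) ** 2 for x, y in data) / len(data)

# Toy dataset (logical OR), used purely for illustration.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

random.seed(0)  # fixed seed so the run is repeatable
weights = [random.uniform(-1, 1) for _ in range(2)]  # random starting parameters
bias = random.uniform(-1, 1)
learning_rate = 0.5  # assumed value, not tuned

initial_loss = mean_squared_loss(data, weights, bias)

for epoch in range(2000):
    random.shuffle(data)  # "stochastic": visit the examples in random order
    for x, y in data:
        p = predict(x, weights, bias)
        # Chain rule: dLoss/dz = dLoss/dp * dp/dz, with Loss = (p - y)^2.
        grad_z = 2 * (p - y) * p * (1 - p)
        # dz/dw_i = x_i and dz/db = 1, so each parameter moves against its gradient.
        weights = [w - learning_rate * grad_z * xi for w, xi in zip(weights, x)]
        bias -= learning_rate * grad_z

final_loss = mean_squared_loss(data, weights, bias)
print(initial_loss, final_loss)
```

With more layers, the same chain rule is applied layer by layer from the output back to the input, which is the bookkeeping that frameworks automate.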
One big question you might have about neural networks is ‘why do they help?’ or ‘why should neural networks be used?’. Well, in my opinion, the answer is ‘flexibility’. Neural networks give you a lot of options (parameters) that decide what the outputs should be. So basically, it’s like thinking a lot before coming to a conclusion. Isn’t taking enough time to think about your decisions always good? (And if you think far too much, that becomes overthinking, which many of you might have started linking with overfitting. If you did, that’s great: having a lot of layers in an ANN is in fact not always good.)
Now, this may not sound very intuitive or practical to you, but believe me, as you go on learning more about ANNs, you will understand its importance (: