Deep Learning for the Layman
Today deep learning is a buzzword, much like data science and machine learning were yesterday. It is no surprise that you get overwhelmed with too much information and too many complicated terminologies and glossaries when you try to understand deep learning from the material available online. This blog is written to help the layman understand what deep learning is without bringing in complicated math and terminology in the first place. Some parts of this writing are fictional, for the obvious reason of teaching deep learning in a much simpler way; however, the facts about deep learning remain the same.
Deep learning is not new.
Deep learning is not a relatively new field; it has been around for a long time. The techniques used in deep learning help machines learn and solve problems, so this field of study comes under machine learning. However, the techniques used in this field did not produce impressive results and faced obstacles that were considered unsolvable in those days. Hence deep learning was unpopular and remained in the dark for quite some time.
But recent breakthroughs in research have overcome most of those obstacles with high-quality results. I will cover what the breakthroughs were and how things turned around for deep learning in another post; for now, let's keep things simple and understand the big picture.
Define Deep learning.
Deep learning is the science and study of applying neural networks (sometimes referred to as deep neural networks or artificial neural networks) to solve complex problems in the machine learning field.
What is a Neural Network?
Back in history, computer scientists wanted machines to think, learn and solve problems. They observed the biological brain and nervous system to achieve this. They found that a biological component in the brain called the Neuron is responsible for human intelligence; hence, keeping the biological brain as the inspiration, they developed the concept of an Artificial Neuron for machines to solve problems.
Biological Neuron components.
Dendrites - A pipeline that carries input signals
Nucleus - Consider this as a function that manipulates the input and decides what the output should be and where the output needs to be sent.
Axon - A pipeline that carries output signals
Axon ending - Connecting point to another Neuron's dendrites.
Biological Neuron vs Artificial Neuron
Artificial Neuron components.
Input - A pipeline that carries input.
Function - A math function that takes the input, applies user-defined logic and sends the output.
Output - A pipeline that carries the output.
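To make these three components concrete, here is a minimal sketch in Python. The weights and the threshold rule are illustrative assumptions, not part of the original description:

```python
# A minimal artificial neuron: inputs -> function -> output.
# The weights and threshold here are arbitrary, illustrative values.

def neuron(inputs, weights, threshold=0.5):
    # Function: combine each input with its weight...
    total = sum(x * w for x, w in zip(inputs, weights))
    # ...and decide the output (here, a simple threshold rule)
    return 1 if total > threshold else 0

# Output: the signal sent onward, like an axon ending to the next neuron
print(neuron([1.0, 0.0, 1.0], [0.4, 0.9, 0.3]))  # prints 1 (0.4 + 0.3 = 0.7 > 0.5)
```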
The human brain has billions of interconnected Neurons (referred to as a biological neural network) and they talk to each other by sending and receiving electrical impulses, which is what makes humans think and have consciousness. This network of communication has been mimicked with Artificial Neurons, giving birth to something called an Artificial Neural Network (ANN).
A novice reader may now think: the concept or theory of ANNs above is fine, but how are ANNs used to solve problems in the real world?
First let’s see what the real-world problems are. For example, a real-world problem for a machine could be image classification, object detection, handwriting detection, text-to-speech processing, speech-to-text processing, learning to play chess, learning to diagnose health problems, driving a car or even responding to your joke.
To crack all these problems, researchers started to think from the biological brain's perspective. What they found was interesting: the biological brain does not work in the same way for every problem it encounters. That is, when a human sees an image and recognises its details, only some set of Neurons in the neural network start functioning and communicate to solve the puzzle, whereas when driving a car, a totally different set of Neurons start functioning and communicate with each other in a totally different way.
This helped researchers understand that there are many different neural networks in the biological system, each becoming functional based on the problem it encounters.
Following the same fact, researchers concluded that no single Artificial Neural Network is needed to solve all problems. Instead, we can artificially create different types of Neural Networks for solving different problems, i.e. a specific neural network for image processing, a specific neural network for driving a car, and so on. In day-to-day practice, all these specifically created neural networks are collectively referred to as Artificial Neural Networks (ANNs) once again.
Understanding the internals of Artificial Neuron
The simplest Neuron in a network may look something like what is shown in the diagram. It will have inputs, an output, a math function (aka activation function), weights (W) and an optional bias (B). Weights W are nothing but user-defined numbers that will be used in the activation function for computation. Let's say you want to solve a classification problem: given x1, x2, x3, x4 you want to find the class Y. To do this in a neural network, you define a hypothetical math function, say the one shown in the diagram.
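Since the diagram cannot be reproduced here, assume a hypothetical function such as a weighted sum. A sketch of such a neuron, with made-up weights W and bias B, might look like this:

```python
# Hypothetical neuron for classifying inputs (x1, x2, x3, x4) into class Y.
# The weights W and bias B below are arbitrary starting guesses.
W = [0.2, -0.1, 0.5, 0.3]
B = 0.1

def predict(x):
    # Activation function: weighted sum of inputs plus bias
    return sum(xi * wi for xi, wi in zip(x, W)) + B

print(predict([1.0, 2.0, 0.5, 1.0]))
```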
Since it is a supervised problem, you already know what the output should be for a certain set of inputs. Now provide the inputs to the model and it will give the output Y. If the Y value is not the desired value you expected, you are left with two options: either change the hypothetical math function, or change the weights so that the output from the model will be close to the actual expected value. This process is called training the network.
In most cases, rather than changing the math function, you will adjust the weights in the network so that the output matches the actual expected value. One way of finding suitable weights is by manually providing some arbitrary values for the weights and then checking the output value. Another, more systematic, method of finding suitable values for the weights is called back propagation. But back propagation has its own disadvantages and we will not discuss it in this post.
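The first approach, trying arbitrary weight values and checking the output, can be sketched as follows. The training pair and the search loop here are illustrative assumptions, not a method from any particular library:

```python
import random

# Trial-and-error weight search: guess arbitrary weights, keep the best.
# The training pair below is made up for illustration.
x = [1.0, 2.0]           # known inputs
expected = 1.0           # known output for these inputs (supervised)

def model(inputs, w):
    # Hypothetical activation function: a weighted sum
    return sum(xi * wi for xi, wi in zip(inputs, w))

random.seed(0)
best_w, best_err = None, float("inf")
for _ in range(1000):
    w = [random.uniform(-1, 1) for _ in x]   # provide arbitrary weights
    err = abs(model(x, w) - expected)        # check output vs expected
    if err < best_err:
        best_w, best_err = w, err

print(best_err)  # small: these weights bring the output close to expected
```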
Once the optimal weights are determined, we use those weights in the network and calculate the output. Now the output will be close to the expected value. We will use the same weights for finding an unknown Y.
Note: In the above example we defined an arbitrary activation function. But in the most popular ANNs you will see functions like Sigmoid, TanH and ReLU used as activation functions inside the networks. There are no fixed criteria for which function should be used for which problem; the selection of an activation function is purely output driven.
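For reference, here is what those three common activation functions compute, written out in plain Python using their standard definitions:

```python
import math

# The three common activation functions mentioned above.
def sigmoid(z):
    return 1 / (1 + math.exp(-z))   # squashes any value into (0, 1)

def tanh(z):
    return math.tanh(z)             # squashes any value into (-1, 1)

def relu(z):
    return max(0.0, z)              # passes positives, zeroes out negatives

print(sigmoid(0), tanh(0), relu(-2.0))  # 0.5 0.0 0.0
```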
Similarly, using the same approach as shown above, we can connect the output of one neuron to another neuron to solve much more complex problems. Next let's see what ANNs are available today and what types of problems they try to solve. As of this writing there are many varieties of ANN available. We will cover only the basic and popular ANNs in this post.
Perceptron.
The Perceptron is one of the simplest forms of Artificial Neural Networks. We call an Artificial Neural Network a Perceptron when only one neuron layer is used in the network, as shown in the diagram below. A Perceptron can be used for trivial classification problems. However, Perceptrons are only considered a baby step towards learning and building more complex networks.
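Here is a hand-wired sketch of a Perceptron solving a trivial classification problem, the logical AND of two inputs. The weights and threshold are chosen by hand purely for illustration:

```python
# A single-neuron Perceptron classifying (x1, x2) as AND(x1, x2).
# Weights and threshold are hand-picked, not learned.
def perceptron(x1, x2, w1=1.0, w2=1.0, threshold=1.5):
    return 1 if x1 * w1 + x2 * w2 > threshold else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", perceptron(a, b))  # only (1, 1) gives 1
```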
Multi Layer Perceptron (MLP).
When multiple layers are involved in the network, we call it a Multi Layer Perceptron. In other words, multiple Perceptrons combine to form an MLP; refer to the diagram below.
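As a sketch of why multiple layers help, here is a tiny hand-wired MLP solving XOR, a problem a single Perceptron cannot solve. All weights are hand-picked for illustration:

```python
# A tiny Multi Layer Perceptron: two Perceptron layers solving XOR.
def step(z):
    return 1 if z > 0 else 0

def mlp(x1, x2):
    # Hidden layer: two Perceptrons
    h1 = step(x1 + x2 - 0.5)        # acts like OR
    h2 = step(x1 + x2 - 1.5)        # acts like AND
    # Output layer: one Perceptron combining the hidden outputs
    return step(h1 - h2 - 0.5)      # OR and not AND = XOR

print([mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```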
The evolution of the Multi-Layer Perceptron gave birth to two main categories of ANN. This categorisation is based on how the MLP internally communicates within the network.
1. Feed forward networks.
2. Recurrent or feedback networks.
When you design an ANN and allow inputs to travel in only one direction, i.e. from the input to the hidden layers (if any) and then to the output, that type of ANN falls into the category called Feed Forward. The name comes from the fact that input signals are passed and processed in only one direction: forward.
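A feed-forward pass can be sketched like this, with signals travelling strictly from input to hidden to output (all weights are arbitrary illustrative values):

```python
# Feed-forward: signals travel strictly input -> hidden -> output.
def layer(inputs, weights):
    # Each row of weights produces one neuron's output (weighted sum)
    return [sum(x * w for x, w in zip(inputs, row)) for row in weights]

x = [1.0, 0.5]                                # input layer
hidden = layer(x, [[0.4, 0.6], [0.3, -0.2]])  # forward to hidden layer
output = layer(hidden, [[0.5, 0.5]])          # forward to output layer
print(output)                                 # signals never flow backward
```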
Recurrent or feedback.
ANNs belonging to this category allow inputs to travel forward, backward or loop through, as shown in the diagram below.
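A minimal sketch of the feedback idea: the neuron's previous output is fed back in as an extra input, giving the network a simple memory of past signals (weights are illustrative):

```python
# Recurrent: the neuron's previous output loops back in as an input.
def recurrent_step(x, prev_output, w_in=0.5, w_back=0.5):
    return x * w_in + prev_output * w_back   # feedback loop

state = 0.0
for x in [1.0, 0.0, 0.0]:          # a sequence of inputs over time
    state = recurrent_step(x, state)
    print(state)                   # 0.5, then 0.25, then 0.125
```

Notice that the output keeps echoing the first input even after it is gone; that echo is what the feedback connection adds over a feed-forward network.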
Epilogue: As we discussed, many neural networks have been developed since the inception of the MLP, and today some of the widely used networks are the Convolutional Neural Network, the Autoencoder, the Restricted Boltzmann Machine, the Recursive Neural Tensor Network, etc. In all these ANNs the basic concept of the Artificial Neuron remains the same; however, their internal network communication and architecture differ to achieve their desired goals.
Final thoughts on selecting the right network.
If you are interested in finding patterns in unlabelled data, then use a Restricted Boltzmann Machine (RBM) or an Autoencoder network. For text processing tasks like sentiment analysis, parsing, or named entity recognition, use a Recursive Neural Tensor Network (RNTN). For image processing, use a Convolutional Neural Network (CNN) or a Deep Belief Network. For object recognition, use a convolutional net or a Recursive Neural Tensor Network (RNTN). For speech recognition, use a Recurrent Neural Network (RNN). For general classification problems, use an MLP with ReLU as the activation function.