Hello! This is the first in a series of posts on neural networks: what they are, how they work, and the frameworks we use to build them.
What, How, Why?
I’d like to start off by asking why we even need neural networks in our lives, what a neural network is, and how we use them. To make things interesting, I’ll answer these in reverse order. We use neural networks every day! Your brain is a neural network, composed of nerve cells, or ‘neurons’, that continuously fire impulses that cause you to perform actions. Sitting in a chair. Driving a car. Breathing. All these neurons are arranged together to form a network of nerves, or neural network. The brain’s neural net (which can be considered a combination of “feed-forward” neural networks) continuously learns from its results. When an action produces a desired result, the neurons involved are reinforced, making that outcome more likely to be reproduced in the future. Over many iterations of this learning from repeated actions, the neurons can be tuned to peak performance.
This is what gives neural networks the edge over other machine learning algorithms, such as Bayesian techniques or support vector machines: the ability to tune their parameters to achieve better results.
The same concept of neurons in the brain is used for creating artificial neural networks (ANNs). The basic unit of a neuron in an ANN is called a “perceptron”. A perceptron feeds the signal produced by its inputs into an “activation function”. In the two-input case, the value passed to the activation function is the sum of the two input variables multiplied by their respective weights.
For example, in the figure, x1 and x2 are the inputs, with weights of W1 and W2 respectively. The value passed to the activation function f(x) would be x1W1 + x2W2. Here f(x) is an arbitrary function, usually a “sigmoid” function: a mathematical function with an “S”-shaped (sigmoid) curve.
Let us assume that x1 = 1, x2 = 0, W1 = 0.7 and W2 = 0.5, with the activation function being the logistic function (https://en.wikipedia.org/wiki/Logistic_function) f(x) = 1 / (1 + e^-x). Then we get:
y = f(x1W1 + x2W2) = f(1 × 0.7 + 0 × 0.5) = f(0.7) ≈ 0.668
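The calculation above can be sketched in a few lines of Python. This is a minimal illustration of a two-input perceptron with a logistic activation, not a full implementation:

```python
import math

def perceptron(x1, x2, w1, w2):
    """Weighted sum of two inputs, passed through the logistic (sigmoid) function."""
    z = x1 * w1 + x2 * w2
    return 1 / (1 + math.exp(-z))

y = perceptron(1, 0, 0.7, 0.5)
print(round(y, 3))  # prints 0.668, matching the worked example
```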
A neural network consists of multiple “hidden layers”: layers of interconnected neurons. The input layer collects input patterns and passes them through perceptrons to the output layer. The output of the output layer is mapped to an action or signal that is of importance. For example, the input layer could take a set of readings for precipitation, temperature and humidity on a certain day. Each of these inputs carries an initially random weight. By applying a function to them, the outputs can be classified as one of “cloudy”, “sunny” or “rainy”. Hidden layers allow the neural network to tune the input weights until its error rate is minimized.
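To make the layer idea concrete, here is a sketch of a single forward pass through a tiny network with one hidden layer. The weights and input values are made up purely for illustration (in a real network they would be learned, not hand-picked), and the weather labels follow the example above:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def layer(inputs, weights):
    """Each row of `weights` holds one neuron's input weights;
    returns one sigmoid activation per neuron."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row))) for row in weights]

# Made-up weights: 3 inputs -> 2 hidden neurons -> 3 output neurons.
hidden_weights = [[0.2, -0.4, 0.9], [0.5, 0.1, -0.3]]
output_weights = [[0.8, -0.2], [-0.6, 0.7], [0.1, 0.4]]

inputs = [0.3, 0.7, 0.5]  # precipitation, temperature, humidity (scaled to 0-1)
hidden = layer(inputs, hidden_weights)
outputs = layer(hidden, output_weights)

labels = ["cloudy", "sunny", "rainy"]
print(labels[outputs.index(max(outputs))])
```

Training would then adjust the weights to reduce the error between these outputs and the known correct labels; that is the tuning step the hidden layers make possible.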
Neural networks are used in algorithmic trading, weather forecasting, anomaly detection in healthcare, credit card fraud detection and performance tuning.
Next, we’ll look at how we can go about building a neural network on our own.
About the author:
I’m Ajay, and I’m a full-stack developer at VMware. I work out of Bangalore in India. I’ve spent most of my career building web applications and designing microservices that scale, and I’ve always had a keen interest in machine learning, data science and deep learning.