Data Science Crash Course 8/10: Neural Networks
This is the 8th instalment of the Data Science Crash Course, and we’re going to talk about neural networks and how to use them to classify data. It’s going to be a gentle introduction, as I’m assuming you have never used them.
What are neural networks
Neural networks start with the perceptron, which is modeled on a single neuron in the human brain. You can think of a network as a couple of functions composed one after another. You take the input data, and each node computes a weighted sum of its inputs and applies an activation function to it (this can be a linear function, ReLU, a sigmoid function, or something else). The result is passed to the next node, which again applies its own weights and activation function. It all goes in layers like that; stacking nodes this way gives the more general feedforward network.
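To make the "functions composed one by one" idea concrete, here is a minimal sketch in plain Python. The weights, biases and layer sizes are made-up numbers chosen for illustration; they don't come from any trained model.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases, activation):
    """One layer: every node takes a weighted sum of all inputs,
    adds its bias, and applies the activation function."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


# Hypothetical parameters: 2 inputs -> 2 hidden nodes (ReLU) -> 1 output node (sigmoid).
hidden_weights = [[0.5, -1.0], [1.5, 2.0]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -0.5]]
output_biases = [0.25]

x = [1.0, 2.0]
hidden = layer(x, hidden_weights, hidden_biases, relu)          # first function
output = layer(hidden, output_weights, output_biases, sigmoid)  # composed with the second
print(hidden, output)
```

The whole network is just `layer(layer(x, ...), ...)`: the output of one layer is the input of the next.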
So let’s start with a perceptron example. We’ll have a look at how perceptrons are implemented in sklearn, and then we’ll switch to the Keras framework. Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow. TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and developers easily build and deploy ML-powered applications.
Here’s a simple example with sklearn where we try to fit numbers using perceptron:
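A minimal sketch of such an example, assuming scikit-learn’s `Perceptron` classifier and its bundled handwritten digits dataset. The dataset, the train/test split and the parameter values are my choice for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

# 8x8 images of handwritten digits, each flattened to 64 numbers.
X, y = load_digits(return_X_y=True)

# Hold out a quarter of the data to test on examples the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Illustrative settings; random_state only makes the run repeatable.
clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```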
As you can see, the code is short and simple, but the details are in the mathematics. At this point, you should go back to some linear algebra in order to understand what’s going on.
The following paragraph is more technical, but I tried to put all the terms together (with links so you can read more about each concept):
Formally, a multilayer perceptron is a class of feedforward neural network. A feedforward neural network is a neural network in which the connections between the nodes do not form a cycle. Backpropagation is an algorithm used to train feedforward neural networks in supervised learning. It computes the gradient of the loss function with respect to the weights of the network for a single input/output example. In this context it’s also important to understand gradient descent, which is an algorithm used to find a local minimum of a function: it uses that gradient to repeatedly nudge the weights in the direction that decreases the loss.
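Here is a small sketch of how these two ideas fit together, assuming the simplest possible "network": a single sigmoid neuron with a squared-error loss. With only one neuron, the backward pass is just a couple of chain-rule steps; the starting weights, the training example and the learning rate are made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Squared error of a single sigmoid neuron on one input/output example."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2


def gradients(w, b, x, y):
    """Chain rule, applied backwards from the loss to the parameters."""
    a = sigmoid(w * x + b)       # forward pass
    dloss_da = a - y             # derivative of the loss w.r.t. the activation
    da_dz = a * (1.0 - a)        # derivative of the sigmoid
    dloss_dz = dloss_da * da_dz
    return dloss_dz * x, dloss_dz  # dloss/dw, dloss/db


# Made-up example: we want the neuron to output 1.0 for the input 2.0.
x, y = 2.0, 1.0
w, b = -0.5, 0.0
learning_rate = 0.5

initial_loss = loss(w, b, x, y)
for step in range(200):
    grad_w, grad_b = gradients(w, b, x, y)
    # Gradient descent: move each parameter against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
final_loss = loss(w, b, x, y)

print(f"loss before: {initial_loss:.4f}, loss after: {final_loss:.4f}")
```

In a real multilayer network the same chain rule is applied layer by layer, from the output back to the input, which is where the name backpropagation comes from.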
Let me note here that deep learning is machine learning that uses neural networks with at least 3 layers.
Neural Networks in Keras
Now let’s go back to examples. As I’ve mentioned, Keras and TensorFlow are among the most widely used machine learning frameworks at the moment, and Keras is especially easy to pick up. Building sequential models with it is a bit like playing with Lego. Let’s have a look at this example to see a multilayer perceptron again, this time built with Keras:
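A minimal sketch of such a model, assuming Keras as bundled with TensorFlow (`tensorflow.keras`). The layer widths, dropout rate, optimizer, number of features and number of epochs are my assumptions for illustration.

```python
import numpy as np
from tensorflow import keras

# Random numbers, just to have something to train and test on.
rng = np.random.default_rng(0)
x_train = rng.random((1000, 20))
y_train = rng.integers(0, 2, size=(1000, 1))
x_test = rng.random((100, 20))
y_test = rng.integers(0, 2, size=(100, 1))

# A sequential model is built by stacking layers one after another.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(loss="binary_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128, verbose=0)

# evaluate() returns the loss followed by the metrics passed to compile().
loss, accuracy = model.evaluate(x_test, y_test, batch_size=128, verbose=0)
print(f"Test loss: {loss:.3f}, test accuracy: {accuracy:.3f}")
```

Since the labels here are random, there is nothing real to learn, so the test accuracy should hover around 50%. The point is only to show the shape of the code.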
I won’t go into details, but here is what is happening:
- We import Keras.
- We generate some random numbers to have some data to train on and then test on (X and y, each split into train and test).
- We build a sequential model by adding 3 layers. Two of them have ReLU as an activation function and the last one has a sigmoid. We also use dropout, but I won’t discuss that here, though it’s worth reading more about it.
- We then compile the model and fit it to our training data.
- We evaluate the model on our test data.
All in all, this is how neural networks and machine learning work in a nutshell:
- Set up the stage by importing packages, frameworks and data (both to train and test).
- Build a model.
- Fit the model to the training data.
- Evaluate the model on test data.
- Tweak the model’s parameters, tweak the data, repeat.
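The loop above can be sketched in a few lines. This is only an illustration, assuming scikit-learn’s `MLPClassifier` and the bundled digits dataset; the candidate layer sizes and the other parameter values are arbitrary choices of mine.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# 1. Set up the stage: load the data and split it into train and test parts.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values range from 0 to 16; scale them to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}
# 5. Tweak a parameter (here the hidden layer sizes) and repeat.
for hidden_layer_sizes in [(8,), (32,), (32, 32)]:
    # 2. Build a model.
    model = MLPClassifier(
        hidden_layer_sizes=hidden_layer_sizes, max_iter=500, random_state=0
    )
    # 3. Fit the model to the training data.
    model.fit(X_train, y_train)
    # 4. Evaluate the model on the test data.
    results[hidden_layer_sizes] = model.score(X_test, y_test)

for sizes, test_accuracy in results.items():
    print(f"hidden layers {sizes}: test accuracy {test_accuracy:.3f}")

best = max(results, key=results.get)
print("Best configuration:", best)
```

In a real project you would usually compare configurations on a separate validation set and keep the test set for the final check only, but the rhythm of build, fit, evaluate and tweak stays the same.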
Data Science is practical and it’s all about testing and trying, so now it’s your turn. Open your Jupyter Notebook and play with data. I also recommend reading about how to use Keras with the MNIST dataset; here’s a great tutorial for that.
If you prefer a video version of this lecture, watch this: