Neural Networks with Scikit-Learn


What is a Neural Network?


A lot of Machine Learning is inspired by how the human mind works. Neural Networks go one step further - they take inspiration from the way neurons are laid out and connected in the human brain.
Each neuron receives multiple inputs and, based on them, produces a single output that feeds into another set of neurons, and so on. The nervous system is built of many such neurons connected to each other. Each neuron contributes to the decision process by appropriately forwarding the input signal, based on the training it has gathered. Individually, each neuron has minimal functionality; connected together, they can do wonders - and that connected structure is what artificial Neural Networks try to imitate.
From a mathematical point of view, there is a limit to what a linear function can model, and higher-degree polynomials were often not good enough to justify their computational expense. Hence the concept of neurons picked up momentum. An artificial neuron is a linear function with a non-linear topping called the activation function: each neuron is defined by a weight for each input and a bias, and the resulting weighted sum is fed into the activation function. The final output becomes the input for the next set of neurons. Such an artificial neuron is called a Perceptron.
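In code, a single perceptron is only a few lines. The sketch below uses NumPy; the inputs, weights and bias are made-up values, purely for illustration:

import numpy as np

def sigmoid(z):
    # a common activation function, squashing any value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs coming from the previous layer
w = np.array([0.8, 0.1, -0.4])   # one weight per input
b = 0.2                          # bias

output = sigmoid(np.dot(w, x) + b)   # this value feeds the next layer of neurons
print(output)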
Often the network has multiple layers of such Perceptrons. That is called an MLP (Multi-Layer Perceptron). In an MLP, we have an input layer, an output layer and zero or more hidden layers.
Each Perceptron has an array of inputs and an array of weights that are multiplied with the inputs to generate a scalar. This processing is linear, so stacked layers of it alone cannot fit a non-linear curve, irrespective of the depth of the network. If the network has to fit non-linear curves, we need some non-linear element in each perceptron. Hence, perceptrons are tipped with a non-linear activation function. This could be a sigmoid, tanh or ReLU; researchers have proposed several activation functions with specific advantages.
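For reference, the three activations mentioned above can be written directly in NumPy (this is just an illustration; in Scikit-Learn the activation is selected by name through the activation parameter of the classifier):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))     # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                    # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)            # zero for negatives, identity for positives

print(relu(np.array([-2.0, 3.0])))       # [0. 3.]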
With everything in place, a neural network is just layers of such perceptrons wired together, the output of each layer feeding the next.
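As an illustration of that wiring, here is a minimal forward pass through a tiny network. The sizes are arbitrary and the weights are random and untrained, purely to show how data flows from layer to layer:

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)           # random, untrained weights just for illustration

x = rng.normal(size=4)                   # input layer: 4 features
W1 = rng.normal(size=(3, 4))             # hidden layer: 3 perceptrons, 4 weights each
b1 = np.zeros(3)
W2 = rng.normal(size=(1, 3))             # output layer: 1 perceptron
b2 = np.zeros(1)

hidden = relu(W1 @ x + b1)               # each hidden perceptron: weighted sum + bias + activation
output = W2 @ hidden + b2                # output layer
print(output)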
The layout, width and depth of the network is one of the most interesting topics of research. Experts have developed different kinds of networks for different kinds of problems. The deeper and wider the network, the more capacity it has. The human brain has around 100 billion neurons; artificial Neural Networks are nowhere near that, with researchers reporting experiments on the order of millions of neurons. The concept of large neural networks, or Deep Learning, is not new, but for a long time it was limited to mathematical curiosity and research papers. The recent boom in the availability of massive training data and computing power has made it a big success.
But even small networks with very few neurons are capable of performing some simple tasks. Let us look at one such task implemented with Scikit-Learn.
Training and tuning Neural Networks is a massive subject and deserves many dedicated blog posts of its own.

Implementation

Scikit-Learn provides a basic implementation of Neural Networks. To implement it in Python, we start by importing the required libraries:
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
Next, we load the dataset used to train and test the network. As in the previous examples, we use the breast cancer dataset:
cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, stratify=cancer.target, random_state=42)
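As an optional sanity check (not required for the rest of the example), you can confirm the split sizes and see that stratify has kept the class proportions similar in both subsets:

import numpy as np

print(X_train.shape, X_test.shape)                 # train_test_split holds out 25% for testing by default
print(np.bincount(y_train), np.bincount(y_test))   # class counts in each subset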
Now we can create an instance of the classifier and train it. Here we use the lbfgs solver, an L2 regularization strength alpha of 1e-5, and two hidden layers with 5 and 2 neurons respectively:
clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X_train, y_train)
MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(5, 2), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True,
       solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)
With this training, we can check how well our model works
print('Accuracy on the training subset: {:.3f}'.format(clf.score(X_train, y_train)))
Accuracy on the training subset: 0.939
print('Accuracy on the test subset: {:.3f}'.format(clf.score(X_test, y_test)))
Accuracy on the test subset: 0.944
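Beyond a single accuracy number, we could also look at per-class precision and recall. This follow-up is not part of the original walk-through, but a possible sketch is:

from sklearn.metrics import classification_report

y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred, target_names=cancer.target_names))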
That is decent performance for such a small network.
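If we wanted to push the numbers further, the usual next step is hyperparameter tuning. As a rough sketch (the parameter grid below is arbitrary, not a recommendation), a grid search over the layer sizes and regularization strength could look like this:

from sklearn.model_selection import GridSearchCV

param_grid = {
    'hidden_layer_sizes': [(5,), (10,), (5, 2)],
    'alpha': [1e-5, 1e-3, 1e-1],
}
# a higher max_iter gives lbfgs more room to converge on each candidate
search = GridSearchCV(MLPClassifier(solver='lbfgs', random_state=1, max_iter=1000),
                      param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)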
Scikit-Learn is good for demonstrating small examples, but it does not perform well enough for bigger Neural Networks. TensorFlow and other such libraries are used to handle Deep Neural Networks.