Deep Learning Demystified

AI and machine learning have already taken the industry by storm with interesting use cases. Just as we were coming to grips with the immense range of uses for machine learning, we started hearing about deep learning. So, what is deep learning? Is it just more advanced machine learning, or something else?

Deep learning is a subset of machine learning that loosely mimics the working of the human brain using layers of artificial neurons.

How does it work?

With deep learning, the focus is on building Artificial Neural Networks (ANNs) with several hidden layers.

A deep neural network has three types of layers:
1. An input layer: An input or stream of data points
2. Hidden layers: Processing nodes that are interconnected with the input layer and with one another. A deep neural network has more than two hidden layers
3. An output layer: One or more nodes that transform the processed information into usable output

Neural networks work on pattern recognition, starting with simple patterns and moving to complex ones. They learn simple features in the first layers of the net. Each node is activated, or not, depending on whether its weighted input clears a threshold set by its activation function. The activated nodes feed into the subsequent layers of the network, which combine those features to derive more sophisticated ones. This process continues until the network computes the final output in the output layer.
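To make the layer-by-layer flow described above concrete, here is a minimal sketch of a forward pass in plain Python. The weights, biases and layer sizes are hand-picked assumptions for illustration only (a real network learns them from data), and the net is kept to two hidden layers for brevity.

```python
import math

def relu(x):
    """Pass positive values through and zero out the rest (the node stays inactive)."""
    return max(0.0, x)

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each node takes a weighted sum of all inputs plus a bias."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Hypothetical parameters: 2 inputs -> 3 hidden nodes -> 2 hidden nodes -> 1 output
hidden1_w = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
hidden1_b = [0.0, -0.5, 0.0]
hidden2_w = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
hidden2_b = [0.0, 0.0]
output_w = [[1.0, -2.0]]
output_b = [0.0]

def forward(inputs):
    """Feed the inputs through each layer in turn and return the output layer's values."""
    h1 = dense(inputs, hidden1_w, hidden1_b, relu)   # simple features
    h2 = dense(h1, hidden2_w, hidden2_b, relu)       # combinations of those features
    return dense(h2, output_w, output_b, sigmoid)    # final output between 0 and 1

first_hidden = dense([2.0, 1.0], hidden1_w, hidden1_b, relu)
result = forward([2.0, 1.0])
print(first_hidden, result)
```

Note how the third node of the first hidden layer stays at zero for this input: it is not activated, so it contributes nothing to the layers that follow.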

The levers used to fine-tune a neural network include the weights and biases, which the network learns during training, and design choices such as the number of neurons, the activation functions (ReLU, sigmoid, etc.) and the optimizer (SGD, Adam, etc.). With these, a deep neural network tries to arrive at the best possible outcome for the given problem.
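As a rough illustration of how these levers interact, the sketch below trains a single sigmoid neuron with plain stochastic gradient descent (SGD). The toy dataset (logical OR), the learning rate, the epoch count and the function names are all assumptions chosen for this example, not values from any particular framework.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy dataset (an assumption for illustration): the logical OR of two inputs
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus the bias, passed through the activation function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train(learning_rate=0.5, epochs=2000, seed=0):
    """Adjust the weights and bias one example at a time (SGD)."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in data:
            # For a sigmoid neuron with cross-entropy loss, the gradient with
            # respect to the weighted sum is simply (prediction - target).
            error = predict(weights, bias, inputs) - target
            weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

weights, bias = train()
for inputs, target in data:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

The weights and bias are what the optimizer changes; the learning rate, the number of epochs and the choice of activation function are the knobs a practitioner sets by hand.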

For example, in the figure above of an image recognition use case, the network consists of a hierarchy of layers, where each layer transforms the input data into a more abstract representation (e.g. edge -> nose -> face). The output layer combines those features to make the final prediction.
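The "edge" step at the bottom of that hierarchy can be sketched in a few lines. The article does not say which layer type the network uses, so this example assumes a convolution-style filter of the kind commonly found in image networks. The tiny image and the filter values are hand-written for illustration; a trained network would learn its filter values from data.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy 4x6 grayscale image: dark (0) on the left half, bright (1) on the right half
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-written vertical-edge filter: responds where brightness rises from left to right
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is non-zero only around the boundary between the dark and bright halves, which is the sense in which an early layer "detects an edge". Later layers would combine many such maps into more abstract features.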

Tools & Platforms

Python combined with TensorFlow is one of the most common stacks for deep learning. TFLearn is a high-level framework with clean, easy-to-understand syntax. Another common framework is Keras, which is more robust. Both are high-level frameworks that run on top of TensorFlow.

Simple neural networks are easy to run on a computer’s CPU. Larger experiments, however, may take several hours or even weeks to run. That’s why the majority of users prefer to run deep learning on cloud-based services with modern GPUs.

Google Colaboratory is a Google research project created to enable machine learning education and research. It’s a Jupyter notebook environment that requires almost no setup and runs in the cloud. The notebooks are stored in Google Drive and can be shared as easily as Google Docs or Sheets. Training your model on a GPU can give you speedups of close to 40x, turning a job that takes 2 days into one that takes a few hours. The best part is that Colaboratory is free to use, and you get an unlimited supply of 12-hour sessions of continuous access to a K80 GPU.

How deep is Deep Learning?

Are there real-life use cases for deep learning, or is it just a scholarly thesis topic? You will be surprised to know how deeply established it is!
Since the processing power needed for deep learning is becoming readily accessible through GPUs, distributed computing and more powerful CPUs, we are seeing a rapid surge in the adoption of deep learning. As the data volume grows, deep learning models tend to outperform traditional machine learning models. The image below from industry guru Andrew Ng gives a great perspective on deep learning and its growing importance!

Where to use Deep Learning?

Thanks to our cognitive abilities, we humans are good at identifying patterns in images and classifying specific objects. Be it processing languages, understanding them or picking up on intent, we have mastered it.

Deep learning frameworks are proving effective at tasks that humans excel at, such as image recognition, speech recognition, and translation. Below are the areas where deep learning techniques have seen wide use:

Computer Vision: gaining an understanding of images. E.g. scanning images for specific patterns or reading text
Object recognition: identifying or classifying objects in images or video streams. E.g. cancer detection, Google Photos/Lens
Face recognition: recognizing faces in an image or video stream. E.g. tagging people on Facebook
Natural Language Processing: techniques for the synthesis and analysis of natural language and speech
Speech recognition: recognizing human speech
Speech to text/Text to speech conversion: converting speech to text and vice versa
Entity-Intent recognition: recognizing intents and entities in a conversation or text. E.g. chatbots
Video Analytics: combining computer vision, object recognition and face recognition for video analytics, such as scene analytics for OTT app videos

What is the future?

The future of AI, which combines machine learning and deep learning, continues to be the pursuit of an Artificial General Intelligence that holistically mimics aspects of human intelligence, i.e.
• Understanding Human Language
• Perceiving the world
• Navigating and moving in the world
• Logical Reasoning
• Moral Reasoning
• Emotional Intelligence

To summarize, deep learning techniques are helping us make long and fast strides toward that end goal! Very excited to see this happen!
