
24. Neural Networks and Deep Learning

Kriti Sehgal

In this chapter, we will learn about neural networks and deep learning, among the most prominent machine learning (ML) techniques in use today. Even if these terms are new to you, you have likely heard them discussed frequently in the modern ML landscape.

Neural networks are powerful techniques capable of tackling large and complex ML tasks such as:

  • Image Classification: Classifying objects in images (e.g., Google Photos) and detecting diseases from medical images

  • Speech Recognition: Powering virtual assistants like Siri and Alexa

  • Recommender Systems: Recommending the best videos on YouTube to millions of users every day, surfacing products on e-commerce sites such as Amazon, and suggesting friends on social media sites

  • Text Generation: Generating human-like text in large language models such as ChatGPT, Gemini, and LLaMA.

At this point in the book, you have already built and used a neural network (yes, really!). As you learn about neural networks in this chapter, see if you can identify which technique from earlier chapters was actually a neural network in disguise.

Before we dive deeper into understanding neural networks, let us first explore why they have become so popular today. The fundamental building blocks of neural networks have existed for decades. This naturally raises a question: if these ideas have been around for so long, why are neural networks suddenly experiencing such an explosion of breakthroughs and applications?

  1. Data: Neural networks thrive on large datasets and generally improve as the dataset grows. There is now a huge quantity of data available for training neural networks, which has unlocked their potential to learn complex real-world patterns. The applications mentioned above, once considered impossible, are now routine largely because of the sheer amount of available data.

  2. Advances in Computing Hardware: Even with large datasets, early neural networks were severely limited by hardware; training them on standard CPUs could take months or years. However, neural networks are highly parallelizable algorithms, and breakthroughs in hardware that supports parallelization, especially GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units), have made it feasible to train networks with millions (or billions!) of parameters in a reasonable amount of time (see the first sketch after this list).

  3. Accessibility: Research in neural networks once required specialized knowledge of mathematics and low-level programming. Today, high-level frameworks like TensorFlow, PyTorch, and Keras, along with cloud platforms such as AWS, Azure, and Google Cloud, make it possible to build, train, and deploy these models in a streamlined and accessible way, allowing anyone to learn and use these technologies (the second sketch after this list shows just how compact this can be).
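
To make the parallelization point in item 2 concrete, here is a minimal sketch, assuming PyTorch is installed, that times the same large matrix multiplication, the core operation inside a neural network layer, first on the CPU and then on a GPU when one is available. The matrix size is an arbitrary illustrative choice; actual timings depend entirely on your hardware.

```python
import time

import torch

# The workhorse of a neural network layer is a large matrix multiply,
# exactly the kind of operation GPUs parallelize well.
# The 4096 x 4096 size is an arbitrary choice for illustration.
x = torch.randn(4096, 4096)
w = torch.randn(4096, 4096)

# Time the multiply on the CPU.
start = time.perf_counter()
_ = x @ w
print(f"CPU: {time.perf_counter() - start:.3f} s")

# Repeat the same multiply on a CUDA-capable GPU, if one is present.
if torch.cuda.is_available():
    x_gpu, w_gpu = x.to("cuda"), w.to("cuda")
    torch.cuda.synchronize()  # make sure the data transfer has finished
    start = time.perf_counter()
    _ = x_gpu @ w_gpu
    torch.cuda.synchronize()  # GPU calls run asynchronously; wait for the result
    print(f"GPU: {time.perf_counter() - start:.3f} s")
```

Training a real network repeats operations like this millions of times, which is why hardware that parallelizes them well matters so much.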
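
To illustrate the accessibility point in item 3, here is a minimal sketch using Keras that defines, trains, and evaluates a small feed-forward neural network in roughly a dozen lines. The synthetic dataset, layer sizes, and training settings are illustrative assumptions rather than recommendations.

```python
import numpy as np
from tensorflow import keras

# Synthetic data purely for illustration: 1,000 examples with 20 features
# and a binary label derived from the features themselves.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32").reshape(-1, 1)

# Define a small feed-forward network: an input layer, one hidden layer,
# and a sigmoid output for binary classification.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compiling and training each take a single call.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"training accuracy: {accuracy:.2f}")
```

A few decades ago, reproducing even this small experiment would have meant implementing the training algorithm from scratch; today the framework handles it.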