What exactly is deep learning, and how does the magic happen?

Written by abyshake | Published 2017/10/25
Tech Story Tags: artificial-intelligence | machine-learning | startup | technology | business


A simple quick-dive guide — for everyone.

AI, Machine Learning, Chatbots, NLP — all the buzzwords these days. From developers to analysts to business owners, everyone wants their business to benefit from the magic of AI they have all been hearing so much about. It has come to a point where developers building even a simple computer program have started calling it ‘Artificial Intelligence’. (I recently came across a ‘non-traditional’ chess game which was dubbed as AI, but to be honest, I have seen better chess algos in the 90s.)

This is what it has come to. AI and Machine Learning have become terms that are thrown around in casual conversations these days. Every single business executive wants to incorporate AI in their business processes, but as it happens with almost anything that becomes an ‘in thing’, many of these people are not able to comprehend what AI or ML really is — not fully at least.

FIRST THINGS FIRST, I AM NOT A DATA SCIENTIST

So why the hell am I talking about data science and machine learning? And what makes me think that I am **even qualified** to talk about things I obviously have no expertise in?

Actually, it is my relative unfamiliarity with the topic that makes me the best person to talk about it, since it lets me speak from the point of view of someone who understands business processes but has only just started understanding data science. If you are an expert in data science, this story might not be for you; but if you, like so many out there, are interested in improving your understanding of AI and machine learning, do read on.

Of late, I have been working quite closely with a group of amazing data scientists, and to make myself a useful participant in the discussions we have (and more importantly, to avoid making a complete ass of myself), I needed to understand things better. I needed to get to a point where I at least understood things and how they happen. And that is what I have been doing.

WHAT IS MACHINE LEARNING, DEEP LEARNING AND AI?

The first thing you need to understand is that when you see people talking of Machine Learning, more often than not, they are talking of Deep Learning.

This is what Google displays in a card on Deep Learning — Wiki

But, you aren’t interested in Deep Learning, are you? You aren’t even that much interested in Machine Learning. You are interested in AI. Right?

Wrong! That is the first thing most of us get wrong. You are always interested in machine learning; that is what will make your artificial intelligence intelligent.

AI or Artificial Intelligence is a replication of human intelligence or behavior in machines. It has been around for decades. Remember those chess games you used to play against the computer as a kid? That IS AI. The program is fed rules and scenarios and set up to take specific actions based on different triggers.

The more data and scenarios you feed into the program, the better your AI gets at reacting to triggers.

Machine Learning, on the other hand, enables programs to learn on their own. The machine is now able to come up with its own set of actions based on an understanding and analysis of huge data sets; it no longer depends on the hard-coded rules and scenarios traditionally fed to it.
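To make the difference concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the idea of judging a chess position by its "material difference", the threshold of 3, and the tiny made-up data set. The first function is a hard-coded rule; the second picks its own rule from labelled examples.

```python
# Hard-coded rule: the programmer decides what counts as a "good" position.
def rule_based_is_good(material_diff):
    return material_diff >= 3  # threshold chosen by a human (assumed value)


# Learned rule: the threshold is picked from labelled examples instead.
def learn_threshold(examples):
    """Return the threshold that misclassifies the fewest examples.

    examples: list of (material_diff, was_good) pairs.
    """
    candidates = sorted({diff for diff, _ in examples})

    def errors(threshold):
        return sum((diff >= threshold) != was_good for diff, was_good in examples)

    return min(candidates, key=errors)


# Made-up training data: (material difference, did the position turn out well?)
examples = [(-2, False), (0, False), (1, True), (2, True), (4, True)]
threshold = learn_threshold(examples)


def learned_is_good(material_diff):
    return material_diff >= threshold
```

With this toy data the learned threshold differs from the hand-picked one, which is the whole point: the rule comes from the data rather than from the programmer.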

SO HOW DOES IT HAPPEN?

Let us get back to the chess game. How exactly were the first chess programs written? Let us start with a simple 3x3 grid.

Source for chess images — A step-by-step guide to building a simple chess AI

This is how the first chess games were built. Every possible move leads to a possibility of more moves, with obvious good or bad moves. Here comes the kicker. The computer doesn’t yet know what qualifies as a good or bad move. You, the programmer, need to teach it that. Once you have done that, by feeding in various possible scenarios and assigning weights to different moves, the computer is ready to play. The more scenarios you feed into the system, the better equipped it will be for gameplay.
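As a rough sketch of what "assigning scores to moves" can look like, the snippet below scores a position by counting material. The piece values are the conventional textbook ones and the board encoding (a plain string of piece letters) is a simplification assumed here; neither comes from the guide the images are taken from.

```python
# Conventional piece values; the exact numbers are an assumption.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def evaluate(board):
    """Score a position: positive favors white, negative favors black.

    `board` is a string of piece letters; uppercase = white, lowercase = black.
    """
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.lower())
        if value is None:
            continue  # skip empty squares or separators
        score += value if piece.isupper() else -value
    return score


# Hypothetical positions after two candidate moves for white.
after_quiet_move = "RNK" + "rnk"  # material is even
after_capture = "RNK" + "nk"      # white has taken the black rook

print(evaluate(after_quiet_move), evaluate(after_capture))
```

A move that wins material gets a higher score, so the program prefers it. Real engines add many more hand-written terms, but the principle of a human deciding the scoring is the same.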

However, I must admit, in that image, a couple of things do not make perfect sense to me. For example, bottom row, third image from the left. It can't or at least shouldn't have the same score as the first two since in that particular scenario, the black rook is at obvious risk. Fifth from the left - same.

But, to drive my point home, that image serves its purpose, so let us ignore the technicalities here. I am sure there is a logic behind it. I have to admit, I didn't read it much; not a techie.

For even a simple first move, there exist multiple scenarios.

There would be two more moves for the knight on the right.

That is how you teach the computer different scenarios to expect during gameplay.

Now, you can add in a further layer of complexity and teach the computer to anticipate your countermoves based on the computer’s moves. The number of possibilities increases exponentially with each next level you want the computer to anticipate, and weightage of different moves will also change.
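One common way to anticipate countermoves is the minimax algorithm: assume the opponent always picks the reply that is worst for the computer, and choose the move whose worst case is best. The article does not say which method the early programs used, so treat this as an illustrative sketch. The game tree and its scores are made up.

```python
def minimax(node, maximizing):
    """Return the best achievable score from `node`.

    A node is either a number (the score of a final position, from the
    computer's point of view) or a list of child nodes (possible moves).
    """
    if isinstance(node, (int, float)):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


# Hypothetical two-level tree: three computer moves, each with two replies.
tree = [[3, 5], [-4, 9], [2, 1]]

best_score = minimax(tree, maximizing=True)
best_move = max(range(len(tree)), key=lambda i: minimax(tree[i], False))
```

Each extra level of lookahead adds another layer of nesting to the tree, which is why the number of positions to examine grows so quickly.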

So, in this particular case, the computer was trained using specific data sets, each of which had inputs and expected and probable outputs. For a good period of time, this is exactly how different AI programs were trained.

Deep learning works differently. It is simultaneously simpler and more complex: it learns from data sets with no pre-defined or specific structure.

A rudimentary example would be a chess game designed so that the computer comes up with possible moves for each piece on its own, for every possible scenario of the gameplay, and assesses and assigns favorable scores to different moves. When you train an AI in this fashion, you are enabling it to make logical classifications of the data.

DIVING A BIT DEEPER

Before we talk more about deep learning, let’s take a moment to understand what a neural network, or artificial neural network, is.

Without getting into the technical details, think of it this way. Everything YOU see and do, from identifying images and differentiating between colors and sounds to processing speech and text, is attributed to neurons. They take inputs and pass along outputs. An artificial neural network essentially replicates that same behavior in a computer program.
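For the curious, a single artificial neuron can be sketched in a few lines of Python. This is the usual textbook form (a weighted sum of the inputs plus a bias, passed through a sigmoid), not something specific to this article; the input and weight values are made up.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, squashed into (0, 1) by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


# Two made-up inputs; the second one matters more because its weight is larger.
output = neuron([0.5, 0.2], [0.1, 2.0], bias=-0.3)
print(output)
```

A network is just many of these wired together, with the output of one neuron becoming an input to the next.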

This is what the neural network would look like for a deep learning algorithm.

To put it more in context, imagine a face recognition algorithm.

Instead of training your program on individual input parameters, you build it to identify specific parameters and analyze data on those to come up with the desired results. (That’s slightly oversimplifying things, but in layman’s language, that’s exactly what it is.)

LET’S TAKE AN EXAMPLE

I used to work with a fashion e-retailer, so let’s pick that. Consider an AI that uses deep learning to suggest optimal pricing points for products. (Yes, discounts!!)

So what input variables would I include in my algo? Possibly:

  • Visibility : How much exposure are products getting? (For sub-par visibility items, discounting is pointless. Fix visibility first.)
  • Sizes : Pivotal sizes or off sizes? (E.g. for men’s t-shirts, M, L and XL are pivotal sizes; S and XXL are off sizes)
  • Sell-through rate : Historical data?
  • Category
  • Brand
  • …..

For obvious reasons, the more parameters you add to the mix, the more complex your algo gets, but the more robust as well. For now, let’s live with just 5.
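A neural network only understands numbers, so each of these five inputs has to be turned into one or more numeric values first. The sketch below shows one plausible encoding. The category list, brand list, field names, and the 0 to 1 scales are all assumptions made up for illustration.

```python
CATEGORIES = ["t-shirts", "jeans", "shoes"]  # hypothetical category list
BRANDS = ["brand_a", "brand_b"]              # hypothetical brand list
PIVOTAL_SIZES = {"M", "L", "XL"}             # from the men's t-shirt example


def one_hot(value, choices):
    """1.0 in the slot matching `value`, 0.0 everywhere else."""
    return [1.0 if value == choice else 0.0 for choice in choices]


def encode(product):
    """Turn one product record into a flat list of numbers."""
    return (
        [
            product["visibility"],                              # assumed 0..1 scale
            1.0 if product["size"] in PIVOTAL_SIZES else 0.0,   # pivotal size flag
            product["sell_through"],                            # assumed 0..1 scale
        ]
        + one_hot(product["category"], CATEGORIES)
        + one_hot(product["brand"], BRANDS)
    )


product = {"visibility": 0.4, "size": "S", "sell_through": 0.25,
           "category": "jeans", "brand": "brand_b"}
features = encode(product)
```

Note that five business inputs become eight numbers here, because category and brand each need one slot per possible value.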

This is an example of what the neural network will look like — first column represents input variables, and visualization on right represents final output.

Focus on just the five columns in between. These are the hidden layers.

Notice the number of connections between different neurons? Something that’s noteworthy here would be the ‘weight’ assigned to connections between various neurons. These weights represent importance of the input value. Initially, weights are randomly assigned (or assigned based on your understanding of the criticality of various parameters — aka random).
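Here is a minimal sketch of such a network with randomly assigned weights, in plain Python. The layer sizes (five inputs, five hidden layers of four neurons each, one output), the use of `tanh` inside each neuron, and the input values are assumptions chosen for illustration, not a description of the network in the image.

```python
import math
import random


def init_layer(n_inputs, n_neurons, rng):
    """One list of random weights per neuron, plus a zero bias."""
    return [([rng.uniform(-1, 1) for _ in range(n_inputs)], 0.0)
            for _ in range(n_neurons)]


def forward(layers, inputs):
    """Pass the inputs through every layer in turn."""
    values = inputs
    for layer in layers:
        values = [
            math.tanh(sum(w * v for w, v in zip(weights, values)) + bias)
            for weights, bias in layer
        ]
    return values


rng = random.Random(0)          # fixed seed so the run is repeatable
sizes = [5] + [4] * 5 + [1]     # inputs, five hidden layers, one output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

prediction = forward(layers, [0.4, 0.0, 0.25, 1.0, 0.0])
print(prediction)
```

Because the weights are random, the number that comes out is meaningless at this stage. Training is what turns it into something useful.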

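We will skip over the activation function. What does it do? It takes the weighted sum of a neuron’s inputs and turns it into that neuron’s output, usually squashing it into a fixed range. This step is also what lets the network capture non-linear relationships, which makes the model more accurate.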
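As a rough illustration, here are two widely used activation functions, sigmoid and ReLU, written in plain Python. They are picked as common examples; the article does not say which one its network uses.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive numbers through unchanged, clip negatives to zero."""
    return max(0.0, x)


for x in (-2.0, 0.0, 2.0):
    print(x, round(sigmoid(x), 3), relu(x))
```

Whatever a neuron's weighted sum turns out to be, the activation function decides how much of it gets passed on to the next layer.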

NOW COMES THE DEEP LEARNING. ACTUALLY NOW COMES TRAINING YOUR AI

This is when the actual work starts to happen. To train your AI and to make it precise, a large data set is needed. In our case, that means historical data on pricing patterns and their correlation with the parameters we chose.

Long story short, the AI is trained by feeding it historical input data and comparing the output of your algorithm against the actual output witnessed historically. The AI would give wrong results to begin with, and that is when the ‘weights’ associated with different neural connections start getting optimised. Your algo automatically takes care of this process, and once done, you’re left with an optimal price suggestion engine that you can use on your website.
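The loop of "predict, compare with history, adjust the weights" can be sketched with the simplest possible model: one input, one weight and one bias, tuned by gradient descent. The data below is made up (it follows discount = 0.5 - 0.4 × sell-through exactly), and the learning rate and number of passes are assumed values. A real deep learning model would have many more weights, but the adjustment step works on the same principle.

```python
def train(samples, learning_rate=0.5, epochs=1000):
    """Fit discount = w * sell_through + b by gradient descent on squared error."""
    w, b = 0.0, 0.0  # start from arbitrary weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in samples:
            error = (w * x + b) - target  # predicted minus actual
            grad_w += 2 * error * x
            grad_b += 2 * error
        # Nudge each weight against its average gradient.
        w -= learning_rate * grad_w / len(samples)
        b -= learning_rate * grad_b / len(samples)
    return w, b


# Made-up history: (sell-through rate, discount that worked).
history = [(0.0, 0.5), (0.25, 0.4), (0.5, 0.3), (0.75, 0.2), (1.0, 0.1)]
w, b = train(history)
print(round(w, 3), round(b, 3))
```

The model starts out wrong (both weights are zero) and ends up recovering the pattern hidden in the data without anyone telling it what that pattern was.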

Interesting, wasn’t it? (Even if the story wasn’t, the business implications surely would have been). The more I look at it, the more it fascinates me. And it’s not just data from your business, even demographics data, social media data — it all adds up. The way you can use seemingly irrelevant and non-contextual data in driving impact to your business — who wouldn’t love that?

Btw, if you got overwhelmed there, two things:

  1. To keep things simple, I actually skipped over a bunch of technical terms: Activation Function, Loss Function, Gradient Descent etc.
  2. With more and more off-the-shelf deep learning algos being released by tech majors, the dependency of deep learning AI systems on access to humongous data sets before they can work is reducing with every passing day. (You’ll still need historical data to improve, but the ‘massive data’ dependency is reducing.)

Another quick simplified pictorial example

Also, if you’re a machine learning expert feeling repulsed by the shabby presentation, my apologies. Just trying to help others understand this extremely complex but equally beautiful piece of technology in the simplest way possible.

That’s it for today; see you tomorrow!

I am Abhishek. I am here... there.... Everywhere...

If you want to learn deep learning and neural networks, Michael Nielsen has a free book.


Published by HackerNoon on 2017/10/25