Getting started with Machine Learning in 5 minutes

Learn Machine Learning by building a home price predictor in 15 lines of code.

This blog post was originally published on HP Developers Portal (hp.io).

Machine Learning is here to stay. Whether it will lead to the singularity is still speculative, but there is little question that it will change the way software engineers solve problems. Many companies already use it to create richer user experiences, as with Apple ARKit (cool examples at madewitharkit.com), and to answer complex questions, as with Amazon Echo. Here at HP, we use it to tackle complex challenges ranging from 3D printing to part performance optimization, and even to print customized footwear. Machine Learning is a very powerful technique that problem solvers must understand now, not next year.

Applied Machine Learning

Machine Learning has a lot of moving parts. In this post, I will demystify ML by helping you write working code. And I will discuss what you can do to take the next step.

First consider how software was written before the advent of Machine Learning. Software engineers gave computers step-by-step instructions to solve a specific problem.

Let’s use an example from the banking industry. Say we wanted to write a program that would predict whether a loan applicant was likely to pay back their loan. We could write a program that analyzes that individual’s profile, setting parameters for key variables:

Credit score
Loan amount
Type of loan
Length of membership

The programming logic would look like this:

Fig 1. Simple bank loan approval programmatic logic
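
As a rough illustration of this hand-written approach, here is a minimal Python sketch. The `Applicant` class, the `approve_loan` function, and every threshold in it are made up for this example; they are not taken from any real bank's policy or from the figure above.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    credit_score: int
    loan_amount: float       # in dollars
    loan_type: str           # e.g. "auto", "home", "personal"
    membership_years: float


def approve_loan(a: Applicant) -> bool:
    """Hand-written rules; every threshold below is a hypothetical example."""
    if a.credit_score < 600:
        return False
    if a.loan_type == "personal" and a.loan_amount > 20_000:
        return False
    if a.membership_years < 1 and a.loan_amount > 5_000:
        return False
    return True


print(approve_loan(Applicant(720, 15_000, "auto", 3)))   # True
print(approve_loan(Applicant(580, 15_000, "auto", 3)))   # False
```

Every new rule or changed threshold means editing this function by hand, which is exactly the maintenance burden that grows with the complexity of the problem.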

Depending on the complexity of the problem, manually tweaking parameters and writing instructions can become complex or even impossible. Imagine the complexity of coding for object recognition.

Now a machine can learn to solve problems just by looking at examples.

With Machine Learning, the engineer can train an ML model that learns from the loan data of thousands of borrowers. The ML model can keep improving over time, responding to more data and new trends. For example, after the Equifax security breach in 2017, credit scores from Equifax may not be as valuable as scores from the other two credit reporting agencies. If this is reflected in real loan outcomes, the ML model would tweak its parameters to give less weight to Equifax credit scores. With enough data, the ML model will train itself to find the optimal parameters.

This technique is called supervised learning. We will use it in our tutorial below. (The other popular techniques are unsupervised learning and reinforcement learning.)

Fig 2. Simple bank loan approval ML logic

Home Price Prediction Tutorial

The fastest way to learn about Machine Learning is to build a learning machine, so let’s build our own Home Price Predictor. Let’s say each house has a base value of $240k and each bedroom adds $15k to the price. (We use “k” to abbreviate 1,000 to keep the numbers small.)

Bedrooms	Home Price
0	$240k
1	$255k
2	$270k
3	$285k
4	$300k

Predicting a house price requires a simple linear model (y = mx + b). We can use this formula:

Price of house = ($15k * # of bedrooms) + $240k
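
Written as ordinary code, with the answer hard-coded, the formula looks like the sketch below. The names `home_price_k`, `BASE_PRICE_K`, and `PRICE_PER_BEDROOM_K` are chosen here just for illustration.

```python
BASE_PRICE_K = 240          # b: base value of every house, in $k
PRICE_PER_BEDROOM_K = 15    # m: value added by each bedroom, in $k


def home_price_k(bedrooms: int) -> int:
    """Return the price in thousands of dollars: y = m * x + b."""
    return PRICE_PER_BEDROOM_K * bedrooms + BASE_PRICE_K


# Reproduce the table above.
for bedrooms in range(5):
    print(f"{bedrooms} bedrooms -> ${home_price_k(bedrooms)}k")
```

Here a person supplied m and b. The point of the tutorial is to have the machine recover those two numbers from the table alone.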

Now let’s build a Machine Learning program to do this. Using the training data, we want our model to figure out the values of **m** and **b**, which we know will be 15 and 240, respectively.

We will use Python for our code. Create a new Python file with the code below and name it **home_price.py**. In the code, we start by importing the library and setting up some initial variables, the linear model, and the loss function. If you don’t have your environment set up, consider installing Docker and using the following Docker command.

docker run -it gcr.io/tensorflow/tensorflow:latest /bin/bash

Yes, without the print statements and comments it is just 15 lines of code. 🙌

In the code, we set up some basic placeholders and variables to be used during training. Then we write a loss function, which is calculated from the difference between the prediction and y (the given value, or ground truth). We then pass that loss value to our optimizer. With each iteration, the optimizer tries to bring the prediction as close to y as possible by updating the values of the variables m and b.

Next, we train the model 1000 times with our training data. At the end, you should get output that looks like this:

Value of m is [ 15.00007153] and value of b is [ 239.99978638].

What do you think of the values for m and b? Very close to our expected values, right? 💁
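
To see what an optimizer is doing under the hood, here is a minimal sketch in plain Python with no TensorFlow. It is not the listing from this post. It assumes mean squared error as the loss and a learning rate of 0.05, neither of which is stated above, and the function name `train` is made up for illustration.

```python
# Training data from the table: bedrooms -> price in $k.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [240.0, 255.0, 270.0, 285.0, 300.0]


def train(xs, ys, steps=1000, learning_rate=0.05):
    """Fit y = m * x + b with plain gradient descent on mean squared error."""
    m, b = 1.0, 1.0                     # start both parameters at 1.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        # Error of each prediction: prediction - y.
        errors = [m * x + b - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to m and b.
        grad_m = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Step each parameter a little in the direction that lowers the loss.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b, losses


m, b, losses = train(xs, ys)
print(f"Value of m is {m:.4f} and value of b is {b:.4f}")
print(f"Loss went from {losses[0]:.1f} to {losses[-1]:.2e}")
```

With these assumed settings, m and b should land very close to 15 and 240. A learning rate that is too large makes the updates overshoot and diverge, so that number usually needs some experimentation on other data.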

Below is a visualization of how the model is optimized at each iteration. In the beginning, the values of m and b start at 1.0 (as we specified in our code), but over time they reach the right values. We also see our loss (prediction - y) decrease to 0 over time.

Fig 3. Values of _m_, _b_ and _loss_ over 100 iterations.

Q. How would you account for a neighborhood in your Home Price Predictor?

Q. How would you take home images into consideration?

Hopefully, the above tutorial helped you understand the basics of ML. Soon every full stack engineer will be using ML in their applications. We are not far from npm install object-detect.

If you would like to discuss ML further or need help understanding whether ML can be applied to your problem, feel free to reach out to me. I’m happy to help.

Thanks for reading!

You can connect with me on Twitter or LinkedIn.