Goroutines: How to Run Concurrent Code in Go

One of the greatest strengths of the Go programming language is its built-in support for concurrency, based on Tony Hoare's “Communicating Sequential Processes”. As a developer with a JS and Java background, I was surprised by how easily you can run concurrent code in Go.

The difference between concurrency in Go and other languages

Strictly speaking, goroutines are not OS threads. They are closer to green threads. Let us see what a green thread is.

In computer programming, green threads or virtual threads are threads that are scheduled by a runtime library or virtual machine (VM) instead of natively by the underlying operating system (OS).

- Wikipedia

Green threads emulate multi-threaded environments without relying on any native OS abilities, and they are managed in user space instead of kernel space, enabling them to work in environments that do not have native thread support. The Go scheduler is responsible for multiplexing goroutines onto OS threads. As a result, switching context between green threads (goroutines in our case) is much cheaper than switching between OS threads. A goroutine starts with a stack of about 2 KB, which can grow and shrink as needed, as opposed to the stack of an OS thread, which is often around 8 MB by default.

To summarize, the Go scheduler is part of the Go runtime, works in user space, and uses OS threads. Goroutines run in the context of those OS threads.
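To make that ratio a bit more tangible, here is a minimal sketch that starts far more goroutines than the runtime has threads available for running Go code. The helper name spawnAndCount is made up for this illustration, and sync.WaitGroup and sync.Mutex come from the standard library rather than from anything covered above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnAndCount (a hypothetical helper) starts n goroutines that each
// increment a shared counter under a mutex, waits for all of them,
// and returns the final count.
func spawnAndCount(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	// GOMAXPROCS(0) only reports the current setting without changing it.
	fmt.Println("threads allowed to run Go code at once:", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines completed:", spawnAndCount(10000))
}
```

On a typical laptop the first number is in the single or low double digits, while the ten thousand goroutines still complete almost instantly.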

Cooperative and preemptive

Until version 1.14, Go had only cooperative scheduling. This means a goroutine decides by itself when to give up the processor, and it does so only at certain points (a function call, an I/O operation, waiting for a mutex, reading from a channel, and so on). That can cause a problem: a single goroutine that never reaches any of those points, such as a tight loop, can hog the CPU. So Go 1.14 introduced asynchronous preemption, which is triggered by a time condition. When a goroutine has been running for more than 10 milliseconds, the Go scheduler will try to preempt it.

Let’s take a look at how it works. First of all, how to create a goroutine?

To create a goroutine we need to use the keyword go like so:

go func() {
     //logic of concurrent function
}()
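One thing the snippet above does not show is that main does not wait for the goroutines it starts: when main returns, the program exits. Below is a minimal sketch of one common way to wait, assuming sync.WaitGroup from the standard library (not covered in this article) and a made-up helper named squares.

```go
package main

import (
	"fmt"
	"sync"
)

// squares (a hypothetical helper) computes the square of each input in
// its own goroutine and waits for all of them before returning.
func squares(nums []int) []int {
	out := make([]int, len(nums))
	var wg sync.WaitGroup
	for i, n := range nums {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			// Each goroutine writes to its own index, so no lock is needed.
			out[i] = n * n
		}(i, n)
	}
	wg.Wait() // block until every goroutine has called Done
	return out
}

func main() {
	fmt.Println(squares([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

Passing i and n as arguments gives every goroutine its own copy of the loop values, which is the same trick the article's examples use when they pass parameters to the anonymous function.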

I will not dive into how pointers work in Go, but you can read about it here and watch this.

Our full example looks like this:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	runtime.GOMAXPROCS(1)

	i := 0
	go func(i *int) {
		for {
			*i++
		}
	}(&i)
	
	runtime.Gosched()

	fmt.Println(i)
}

Try running it with a Go version below 1.14 and then with version 1.14 or above. With the older version, the program will wait forever for the infinite loop to finish. The newer version will preempt the goroutine that runs the infinite loop and print the value of i.

Channels

Sometimes we need a way to communicate between goroutines. In Go there is a special slogan:

Do not communicate by sharing memory; instead, share memory by communicating.

What does it mean? Working with concurrent programs is never easy, because you always have to keep in mind race conditions, deadlocks, and other issues. Go introduces channels to help with this. A channel is a means of communication between goroutines. It has an element type (int, string, some struct) and is created with the built-in function make.

ch := make(chan int)

To write to or read from a channel there is a special syntax:

ch <- 2 // write
v := <- ch // read and assign result to variable v

A channel can be buffered or unbuffered. A send on an unbuffered channel blocks until another goroutine is ready to receive the value. A send on a buffered channel that still has free space does not block, and execution continues; it blocks only when the buffer is full.
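Here is a small sketch of that difference. It counts how many sends succeed on a channel that nobody is reading from. The helper name fillBuffered is made up, and the select statement with a default case (not covered in this article) is used only so that the program can notice a send would block instead of hanging.

```go
package main

import "fmt"

// fillBuffered (a hypothetical helper) sends values into a channel of the
// given capacity without any receiver and reports how many sends
// succeeded before a send would have blocked.
func fillBuffered(capacity int) int {
	ch := make(chan int, capacity)
	sent := 0
	for {
		select {
		case ch <- sent:
			sent++
		default:
			// No free slot and no receiver: a plain send would block here.
			return sent
		}
	}
}

func main() {
	fmt.Println(fillBuffered(3)) // prints 3
	fmt.Println(fillBuffered(0)) // prints 0: capacity 0 means unbuffered
}
```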

You can also iterate through the channel.

for v := range ch {
   
}

As you may expect, if there is no value in the channel, execution blocks until some goroutine writes a value to the channel.

You can also close the channel. As a result, the for loop stops iterating once all values sent before the close have been received.

close(ch)
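Putting range and close together, a minimal sketch could look like the following. The function names produce and sum are invented for this example; the point is that the sending side closes the channel, and that is what lets the range loop on the receiving side finish.

```go
package main

import "fmt"

// produce (a hypothetical helper) sends 1..n on a new channel from a
// separate goroutine and closes the channel when it is done.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // without this, the range loop in sum would block forever
		for i := 1; i <= n; i++ {
			ch <- i
		}
	}()
	return ch
}

// sum receives from the channel until it is closed and drained.
func sum(ch <-chan int) int {
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(produce(5))) // prints 15
}
```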

Web crawler

As an example, let’s create a simple function that checks the status of a website:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	websites := []string{
		"https://hackernoon.com/",
		"https://github.com/",
		"https://apple.com/",
		"https://google.com/",
		"https://youtube.com/",
		"https://www.udemy.com/",
		"https://netflix.com/",
		"https://www.coursera.org/",
		"https://facebook.com/",
		"https://microsoft.com",
		"https://wikipedia.org",
		"https://educative.io",
		"https://acloudguru.com",
	}

	for _, website := range websites {
		checkResource(website)
	}

}
func checkResource(website string) {
	res, err := http.Get(website)
	if err != nil {
		fmt.Println(website, "is down")
		return
	}
	// Close the body so the underlying connection can be released.
	defer res.Body.Close()
	fmt.Printf("[%d] %s is up\n", res.StatusCode, website)
}

If you run this, you will see logs like these in the console:

[200] https://hackernoon.com/ is up
[200] https://github.com/ is up
[200] https://apple.com/ is up
[200] https://google.com/ is up
[200] https://youtube.com/ is up
[200] https://www.udemy.com/ is up
[200] https://netflix.com/ is up
[200] https://www.coursera.org/ is up
[200] https://facebook.com/ is up
[200] https://microsoft.com is up
[200] https://wikipedia.org is up
[200] https://educative.io is up
[200] https://acloudguru.com is up

Running this code takes about 10 seconds. The problem, of course, is that we check the resources sequentially, one after another. Now let’s try to make it a little faster. To do that we will use the worker pool pattern: a pool of goroutines that handles the concurrent work. Using a for loop, we create a fixed number of worker goroutines as a resource pool. Then, in the main() “thread”, we use a channel to hand out the work.

First of all, we need to define a worker for our case. It looks like this:

func worker(resources, results chan string) {
	for resource := range resources {
		res, err := http.Get(resource)
		if err != nil {
			results <- resource + " is down"
			continue
		}
		// Close the body so the underlying connection can be released.
		res.Body.Close()
		results <- fmt.Sprintf("[%d] %s is up", res.StatusCode, resource)
	}
}

Let us quickly go through what is happening here. Each worker waits for a website URL on the resources channel. As soon as someone pushes a URL to that channel, one of the workers receives it, checks whether the site responds, and pushes the result to another channel called results.

Now let’s see how we will run our worker pool:

func main() {
	websites := []string{
		//...
	}
	resources := make(chan string, 6)
	results := make(chan string)
	for i := 0; i < 6; i++ {
		go worker(resources, results)
	}
}

Our pool of workers contains 6 goroutines, which are running and waiting for resources to check. To feed them, we can start a separate goroutine from an anonymous function that is invoked immediately (similar to an IIFE in JavaScript):

go func() {
	for _, v := range websites {
		resources <- v
	}
	// No more work: closing lets the workers' range loops finish.
	close(resources)
}()

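Why shouldn’t we send the resources synchronously, inline in main? Try removing go from the final example and you will hit a deadlock: the workers block trying to send to the unbuffered results channel, which nobody is reading yet, while main blocks trying to send to resources once its buffer is full.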

Now we not only have our worker pool but also provide it with work :) The last thing we need to do is read the results from the pool. To do that, we can receive from the results channel in the main goroutine, once per website, and print the result of each check:

for i := 0; i < len(websites); i++ {
	fmt.Println(<-results)
}

The full code looks like this:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	websites := []string{
		"https://hackernoon.com/",
		"https://github.com/",
		"https://apple.com/",
		"https://google.com/",
		"https://youtube.com/",
		"https://www.udemy.com/",
		"https://netflix.com/",
		"https://www.coursera.org/",
		"https://facebook.com/",
		"https://microsoft.com",
		"https://wikipedia.org",
		"https://educative.io",
		"https://acloudguru.com",
	}
	resources := make(chan string, 6)
	results := make(chan string)
	for i := 0; i < 6; i++ {
		go worker(resources, results)
	}

	go func() {
		for _, v := range websites {
			resources <- v
		}
		// No more work: closing lets the workers' range loops finish.
		close(resources)
	}()
	
	for i := 0; i < len(websites); i++ {
		fmt.Println(<-results)
	}

}
func worker(resources, results chan string) {
	for resource := range resources {
		res, err := http.Get(resource)
		if err != nil {
			results <- resource + " is down"
			continue
		}
		// Close the body so the underlying connection can be released.
		res.Body.Close()
		results <- fmt.Sprintf("[%d] %s is up", res.StatusCode, resource)
	}
}

If you run it, you will see that it finishes much faster than the sequential version. You can also play with the number of goroutines and see how it affects the execution time.
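If you want to experiment with the pool size without hitting real websites, a rough sketch like the one below may help. It is not the crawler from above: fakeCheck is a made-up stand-in for http.Get that just sleeps, runPool is a hypothetical helper, and sync.WaitGroup is used so that the results channel can be closed once all workers are done.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fakeCheck stands in for http.Get: it sleeps briefly instead of touching
// the network. It is a hypothetical helper for this sketch only.
func fakeCheck(resource string) string {
	time.Sleep(20 * time.Millisecond)
	return resource + " is up"
}

// runPool checks every resource using the given number of workers and
// returns the results in completion order, not input order.
func runPool(workers int, resources []string) []string {
	jobs := make(chan string)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range jobs {
				results <- fakeCheck(r)
			}
		}()
	}

	// Feed the jobs, then close the channel so the workers' range loops end.
	go func() {
		for _, r := range resources {
			jobs <- r
		}
		close(jobs)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	resources := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	for _, n := range []int{1, 2, 4, 8} {
		start := time.Now()
		got := runPool(n, resources)
		fmt.Printf("%d workers: %d results in %v\n",
			n, len(got), time.Since(start).Round(time.Millisecond))
	}
}
```

Because the simulated work is pure waiting, the elapsed time should drop roughly in proportion to the number of workers; with real HTTP requests the gain depends on the network and on the sites themselves.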