Go was designed with concurrency in mind, with concurrently executing functions called goroutines. They are lightweight threads managed by the Go runtime – not to be confused with threads running at the level of the operating system. In this post I’ll give a brief overview of how they work and how we can use them to do some concurrent data fetching.
What is concurrency?
Concurrency is not parallelism. Software that runs concurrently does not necessarily do multiple things at the same time. Rather, concurrent programming allows us to switch between concurrently running functions (often called coroutines). This is incredibly useful for I/O-bound functions that spend most of their time waiting for something outside the program. For example, when making a request to an API, a function will likely spend most of its total execution time waiting for the response. If we run this function concurrently, we can switch over to other functions while we wait for the API response to arrive.
Setting up an example
Let’s stick with the example of making multiple HTTP requests concurrently. Say we have a RESTful API for our todos, and we want to calculate how many of a given set of todos are already completed. All we need is a list of IDs for our todo items, and from those we can create the corresponding resource URLs:
todoIDs := []int{1, 42, 84, 99, 155, 81, 22, 98, 102, 7, 2, 3}

func generateTodoUrl(id int) string {
    return fmt.Sprintf("https://jsonplaceholder.typicode.com/todos/%d", id)
}

urls := []string{}
for _, todoID := range todoIDs {
    url := generateTodoUrl(todoID)
    urls = append(urls, url)
}
Having a look at an example response from the todo API, we find the following JSON in the body of the response:
{
    "userId": 1,
    "id": 1,
    "title": "delectus aut autem",
    "completed": false
}
We can create a corresponding Todo struct, containing the fields we want from the response body:
type Todo struct {
    ID        int    `json:"id"`
    Title     string `json:"title"`
    Completed bool   `json:"completed"`
}
With this we can now implement the function fetchTodo, which fetches the resource and parses the response body into our Todo struct.
func fetchTodo(url string) (Todo, error) {
    log.Printf("GET %s", url)

    resp, err := http.Get(url)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to fetch data: %v", err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to read response body: %v", err)
    }

    var todo Todo
    err = json.Unmarshal(body, &todo)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to unmarshal JSON: %v", err)
    }

    return todo, nil
}
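Before doing anything concurrent, we could try the function out on a single todo (a small sketch, not from the original post; it assumes the imports used above):

todo, err := fetchTodo(generateTodoUrl(1))
if err != nil {
    log.Fatalf("fetchTodo failed: %v", err)
}
fmt.Printf("%+v\n", todo)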
So far nothing here has had anything to do with goroutines. We could simply loop over our resource URLs and fetch all the todos sequentially.
var completedTodos int
for _, url := range urls {
    // we have to wait for each request to complete before sending out the next
    todo, err := fetchTodo(url)
    if err != nil {
        log.Printf("Error fetching todo: %s", err)
        continue
    }
    if todo.Completed {
        completedTodos++
    }
}
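If we want to see just how long the sequential version takes, we could wrap the loop above with a simple timer (a hypothetical sketch using the standard time package, not part of the original post):

start := time.Now()

// ... run the sequential loop from above ...

log.Printf("sequential fetch of %d todos took %s", len(urls), time.Since(start))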
But that’s slow and inefficient: each request only starts once the previous one has finished. To make these requests concurrently, we need two things:
- A way to run our fetchTodo function concurrently. For this we can wrap it in an anonymous function and start it as a goroutine.
- A way to receive the response data of the requests concurrently. For this we can use channels.
Let’s tackle the second part first.
Channels
In Go, channels are typed conduits that we can send data into and receive data from. They are the canonical way to communicate data between concurrently running functions – goroutines. So essentially a channel lets one goroutine send data to another goroutine. One important thing to note is that the main function in Go also runs in a goroutine.
We can create a channel using make(chan <channelType>). To send data to and receive it from a channel we use the <- operator, with the direction of the arrow indicating how the data flows. These two operations are called communications.
channel := make(chan int)

// send a value into a channel
channel <- 42

// receive a value from a channel
// note: on an unbuffered channel, a send blocks until another goroutine is
// ready to receive (and vice versa), so these two operations normally happen
// in different goroutines
value := <-channel
Apart from send and receive, a channel also supports the close operation, which marks that no more values will be sent on it. If we try to send a value to a closed channel, it will panic. But we can still receive from a closed channel until it is empty. Any subsequent attempt to receive will return the zero value of the channel’s element type.
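To make the close behaviour concrete, here is a small standalone sketch (not part of the todo example): one goroutine sends a few values and then closes the channel, while the receiver uses the two-value receive form to detect when the channel has been closed and drained.

ch := make(chan int)

go func() {
    for i := 1; i <= 3; i++ {
        ch <- i
    }
    close(ch) // no more values will be sent
}()

for {
    value, ok := <-ch // ok is false once the channel is closed and drained
    if !ok {
        break
    }
    fmt.Println(value)
}

// receiving again now returns the zero value of int immediately
fmt.Println(<-ch) // 0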
Making our program concurrent
So now we can use a channel to collect the data from our goroutines. We give our channel the element type Result, which will contain either the received todo data or an error. We also know exactly how many requests we are making, and thus how many results we want to receive on our channel, so we can use that number to create a buffered channel.
type Result struct {
    Data Todo
    Err  error
}

resultsChan := make(chan Result, len(urls))
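A quick aside on why the buffer matters (a minimal sketch, not part of the todo code): sends to a buffered channel only block once the buffer is full, so each of our goroutines below can deliver its result and exit even if the main goroutine hasn’t gotten around to receiving yet.

buffered := make(chan int, 2)
buffered <- 1 // does not block: buffer has room
buffered <- 2 // does not block: buffer is now full
// a third send would block here until something receives from the channel
fmt.Println(<-buffered, <-buffered) // 1 2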
Now that we have our channel set up, we can finally start with the bloody goroutines! Here we simply wrap our fetchTodo call in an anonymous function and prepend the go keyword, which will run the function concurrently as a goroutine.
for _, url := range urls {
    // pass url as an argument so each goroutine gets its own copy
    // (required before Go 1.22, where the loop variable was shared across iterations)
    go func(url string) {
        data, err := fetchTodo(url)
        resultsChan <- Result{Data: data, Err: err}
        log.Printf("FETCHED %s", url)
    }(url)
}
The above will start up a goroutine for every URL we want to request. Once the fetchTodo function has received and processed the response, we send the result into our resultsChan. Then all that’s left is to collect all the results from the channel and do something with them – like counting completed todos and printing the result.
var completedTodos int
// receive exactly one result per request; they arrive in completion order, not request order
for range urls {
    result := <-resultsChan
    if result.Err != nil {
        log.Printf("ERROR %s", result.Err)
        continue
    }
    if result.Data.Completed {
        completedTodos++
    }
}

fmt.Printf("completed %d/%d\n", completedTodos, len(todoIDs))
So that’s pretty much the basics of making Go execute functions concurrently.