Go was designed with concurrency in mind, with concurrently executing functions called goroutines. They are lightweight threads managed by the Go runtime – not to be confused with threads running at the level of the operating system. In this post I’ll give a brief overview of how they work and how we can use them to do some concurrent data fetching.
What is concurrency?
Concurrency is not parallelism. Software that runs concurrently does not necessarily do multiple things at the same time. Rather, concurrent programming allows us to switch between concurrently running functions (often called coroutines). This is incredibly useful for I/O-bound functions that spend most of their time waiting for something outside the program. For example, when making a request to an API, a function will likely spend most of its total execution time waiting for the response. If we run this function concurrently, we can switch over to other functions while we wait for the API response to arrive.
Setting up an example
Let’s stick with the example of making multiple HTTP requests concurrently. Say we have a RESTful API for our todos, and we want to calculate how many of a given set of todos are already completed. All we need is a list of IDs for our todo items, and from those we can create the corresponding resource URLs:
todoIDs := []int{1, 42, 84, 99, 155, 81, 22, 98, 102, 7, 2, 3}

func generateTodoUrl(id int) string {
    return fmt.Sprintf("https://jsonplaceholder.typicode.com/todos/%d", id)
}

urls := []string{}
for _, todoID := range todoIDs {
    url := generateTodoUrl(todoID)
    urls = append(urls, url)
}
Having a look at an example response from the todo API, we find the following JSON in the body of the response:
{
    "userId": 1,
    "id": 1,
    "title": "delectus aut autem",
    "completed": false
}
We can create a corresponding Todo struct, containing the fields we want from the response body:
type Todo struct {
    ID        int    `json:"id"`
    Title     string `json:"title"`
    Completed bool   `json:"completed"`
}
With this we can now implement the function fetchTodo, which fetches the resource and parses the response body into our Todo struct.
func fetchTodo(url string) (Todo, error) {
    log.Printf("GET %s", url)

    resp, err := http.Get(url)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to fetch data: %v", err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to read response body: %v", err)
    }

    var todo Todo
    err = json.Unmarshal(body, &todo)
    if err != nil {
        return Todo{}, fmt.Errorf("failed to unmarshal JSON: %v", err)
    }

    return todo, nil
}
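Before doing anything concurrent, we could try the function out on a single todo (a small sketch, not from the original post; it assumes the imports used above):

todo, err := fetchTodo(generateTodoUrl(1))
if err != nil {
    log.Fatalf("fetchTodo failed: %v", err)
}
fmt.Printf("%+v\n", todo)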
So far nothing here has had anything to do with goroutines. We could simply loop over our resource URLs and fetch all the todos sequentially.
var completedTodos int
for _, url := range urls {
    // we have to wait for each request to complete before sending out the next
    todo, err := fetchTodo(url)
    if err != nil {
        log.Printf("Error fetching todo: %s", err)
        continue
    }
    if todo.Completed {
        completedTodos++
    }
}
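If we want to see just how long the sequential version takes, we could wrap the loop above with a simple timer (a hypothetical sketch using the standard time package, not part of the original post):

start := time.Now()

// ... run the sequential loop from above ...

log.Printf("sequential fetch of %d todos took %s", len(urls), time.Since(start))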
But that’s slow and inefficient: each request only starts once the previous one has finished. To make these requests concurrently, we need two things:
- A way to run our fetchTodo function concurrently. For this we can wrap it in an anonymous function and start it as a goroutine.
- A way to receive the response data of the requests concurrently. For this we can use channels.
Let’s tackle the second part first.
Channels
In Go, channels are typed conduits that we can send data into and receive data from. They are the canonical way to communicate data between concurrently running functions – goroutines. So essentially a channel lets one goroutine send data to another goroutine. One important thing to note is that the main function in Go also runs in a goroutine.
We can create a channel using make(chan <channelType>). To send data to and receive it from a channel we use the <- operator, with the direction of the arrow indicating how the data flows. These two operations are called communications.
channel := make(chan int)

// send a value into a channel
channel <- 42

// receive a value from a channel
// note: on an unbuffered channel, a send blocks until another goroutine is
// ready to receive (and vice versa), so these two operations normally happen
// in different goroutines
value := <-channel
Apart from send and receive, a channel also supports the close operation, which marks that no more values will be sent on it. If we try to send a value to a closed channel, it will panic. But we can still receive from a closed channel until it is empty. Any subsequent attempt to receive will return the zero value of the channel’s element type.
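To make the close behaviour concrete, here is a small standalone sketch (not part of the todo example): one goroutine sends a few values and then closes the channel, while the receiver uses the two-value receive form to detect when the channel has been closed and drained.

ch := make(chan int)

go func() {
    for i := 1; i <= 3; i++ {
        ch <- i
    }
    close(ch) // no more values will be sent
}()

for {
    value, ok := <-ch // ok is false once the channel is closed and drained
    if !ok {
        break
    }
    fmt.Println(value)
}

// receiving again now returns the zero value of int immediately
fmt.Println(<-ch) // 0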
Making our program concurrent
So now we can use a channel to collect the data from our goroutines. We give our channel the element type Result, which will contain either the received todo data or an error. We also know exactly how many requests we are making, and thus how many results we want to receive on our channel, so we can use that number to create a buffered channel.
type Result struct {
    Data Todo
    Err  error
}

resultsChan := make(chan Result, len(urls))
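A quick aside on why the buffer matters (a minimal sketch, not part of the todo code): sends to a buffered channel only block once the buffer is full, so each of our goroutines below can deliver its result and exit even if the main goroutine hasn’t gotten around to receiving yet.

buffered := make(chan int, 2)
buffered <- 1 // does not block: buffer has room
buffered <- 2 // does not block: buffer is now full
// a third send would block here until something receives from the channel
fmt.Println(<-buffered, <-buffered) // 1 2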
Now that we have our channel set up, we can finally start with the bloody goroutines! Here we simply wrap our fetchTodo call in an anonymous function and prepend the go keyword, which will run the function concurrently as a goroutine.
for _, url := range urls {
    // pass url as an argument so each goroutine gets its own copy
    // (required before Go 1.22, where the loop variable was shared across iterations)
    go func(url string) {
        data, err := fetchTodo(url)
        resultsChan <- Result{Data: data, Err: err}
        log.Printf("FETCHED %s", url)
    }(url)
}
The above will start up a goroutine for every URL we want to request. Once the fetchTodo function has received and processed the response, we send the result into our resultsChan. Then all that’s left is to collect all the results from the channel and do something with them – like counting completed todos and printing the result.
var completedTodos int
// receive exactly one result per request; they arrive in completion order, not request order
for range urls {
    result := <-resultsChan
    if result.Err != nil {
        log.Printf("ERROR %s", result.Err)
        continue
    }
    if result.Data.Completed {
        completedTodos++
    }
}

fmt.Printf("completed %d/%d\n", completedTodos, len(todoIDs))
So that’s pretty much the basics of making Go execute functions concurrently.